This pull request fixes the issues caused by relying on specific commit versions.
Depending on commit versions was fragile: some of them have since been merged, which could break functionality.
This update also adds a detailed comment explaining how to use the code with future versions.
Before this change, the code raised the following error:
/opt/conda/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:389: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.1` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`.
warnings.warn(
/opt/conda/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:394: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.1` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`.
warnings.warn(
../aten/src/ATen/native/cuda/Indexing.cu:1292: indexSelectLargeIndex: block: [262,0,0], thread: [32,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1292: indexSelectLargeIndex: block: [262,0,0], thread: [33,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1292: indexSelectLargeIndex: block: [262,0,0], thread: [34,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1292: indexSelectLargeIndex: block: [262,0,0], thread: [35,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1292: indexSelectLargeIndex: block: [262,0,0], thread: [36,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
File /opt/conda/lib/python3.10/site-packages/transformers/modeling_attn_mask_utils.py:343, in _prepare_4d_causal_attention_mask_for_sdpa(attention_mask, input_shape, inputs_embeds, past_key_values_length, sliding_window)
340 is_tracing = torch.jit.is_tracing()
342 if attention_mask is not None:
--> 343 if torch.all(attention_mask == 1):
344 if is_tracing:
345 pass
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
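
For anyone debugging a similar log: the two `UserWarning`s say that `temperature` and `top_p` are ignored when `do_sample` is `False`, and the `srcIndex < srcSelectDimSize` assertion fires when an index passed to an index-select operation is outside the selected dimension, which can happen when a token id is not smaller than the embedding table size. The following is a minimal sketch of two pre-flight checks, not the change made in this pull request. The helper names `clean_generation_kwargs` and `find_out_of_range_ids` and the example values are hypothetical and are not part of transformers or PyTorch.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Sampling-only options named in the warnings above.
SAMPLING_ONLY_KEYS = ("temperature", "top_p")


def clean_generation_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of kwargs without sampling-only options when do_sample is off.

    Hypothetical helper: it only mirrors the advice in the warning text
    ("set do_sample=True or unset temperature / top_p").
    """
    cleaned = dict(kwargs)
    if not cleaned.get("do_sample", False):
        for key in SAMPLING_ONLY_KEYS:
            cleaned.pop(key, None)
    return cleaned


def find_out_of_range_ids(
    input_ids: Iterable[Iterable[int]], num_embeddings: int
) -> List[Tuple[int, int, int]]:
    """Return (row, column, token_id) for every id outside [0, num_embeddings).

    Hypothetical helper: works on plain nested lists so it can run on CPU
    before any tensor is moved to the GPU.
    """
    bad = []
    for row, sequence in enumerate(input_ids):
        for col, token_id in enumerate(sequence):
            if not 0 <= token_id < num_embeddings:
                bad.append((row, col, token_id))
    return bad


if __name__ == "__main__":
    # Example values are illustrative only.
    print(clean_generation_kwargs(
        {"do_sample": False, "temperature": 0.1, "top_p": 0.1, "max_new_tokens": 64}
    ))
    print(find_out_of_range_ids([[1, 5, 32000], [2, 3, 4]], num_embeddings=32000))
```

Running a check like this on CPU gives a readable list of offending positions instead of an asynchronously reported device-side assert. Whether an out-of-range token id is the actual cause in this case is an assumption; the log itself also suggests rerunning with `CUDA_LAUNCH_BLOCKING=1` to get an accurate stack trace.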