Open · iMountTai opened this issue 1 year ago
Hi @iMountTai, thank you for reaching out. Could you please share with us your current setup? Are you using the artifacts as provided in the quickstart or have you modified them?
I first downloaded the pre-trained weights, such as gpt2-large and opt-1.3B. The setup is basically unchanged, except that the model is loaded from the downloaded weights. The output log is below. Env: torch 1.12 + cu113, transformers 4.26.0
Hi @iMountTai, I have just a few questions.
Same error running `python artifacts/main.py artifacts/config/config.yaml --type ACTOR`.
In the reward JSON file, the score is None, so I changed it to "0.01".
NVIDIA-SMI 515.43.04 Driver Version: 515.43.04 CUDA Version: 11.7 torch-1.13.0+cu117 pytorch-cuda 11.7
File "/home/hupenglong/nebullvm/apps/accelerate/chatllama/chatllama/rlhf/actor.py", line 373, in train
est_output = self.model(training_input, attention_mask)
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "<@beartype(chatllama.rlhf.actor.ActorModel.forward) at 0x7fcb26cf6b80>", line 51, in forward
File "/home/hupenglong/nebullvm/apps/accelerate/chatllama/chatllama/rlhf/actor.py", line 114, in forward
model_output = self.model.forward(
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/transformers/models/opt/modeling_opt.py", line 934, in forward
outputs = self.model.decoder(
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/transformers/models/opt/modeling_opt.py", line 698, in forward
layer_outputs = decoder_layer(
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/transformers/models/opt/modeling_opt.py", line 327, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/transformers/models/opt/modeling_opt.py", line 172, in forward
query_states = self.q_proj(hidden_states) * self.scaling
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasLtMatmul( ltHandle, computeDesc.descriptor(), &alpha_val, mat1_ptr, Adesc.descriptor(), mat2_ptr, Bdesc.descriptor(), &beta_val, result_ptr, Cdesc.descriptor(), result_ptr, Cdesc.descriptor(), &heuristicResult.algo, workspace.data_ptr(), workspaceSize, at::cuda::getCurrentCUDAStream())
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [96,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [97,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [98,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [99,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [100,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [101,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [102,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [103,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [104,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [105,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [106,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [107,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
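(Side note, not specific to chatllama: these `indexSelectLargeIndex` asserts are the usual symptom of an embedding lookup receiving an out-of-range index, either a token id at or above the model's vocabulary size or a position beyond `max_position_embeddings`. A minimal sketch of that failure mode, assuming the 50272-entry vocabulary that the facebook/opt-1.3b config reports:)

```python
# Minimal sketch of the failure mode (not chatllama code): an embedding lookup
# with an index outside the table. On CPU the same mistake raises a readable
# IndexError instead of a device-side assert.
import torch

vocab_size = 50272  # assumption: vocabulary size reported by the facebook/opt-1.3b config
emb = torch.nn.Embedding(vocab_size, 16)

bad_ids = torch.tensor([[1, 2, vocab_size + 10]])  # one id past the end of the table
try:
    emb(bad_ids)
except IndexError as err:
    print("CPU error:", err)  # "index out of range in self"
```

On CUDA the same lookup trips the device-side assert, and any later kernel (such as the cuBLAS matmul above) then reports a misleading error.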
> In the reward JSON file, the score is None, so I changed it to "0.01".
Hey @lonelydancer, as far as I can tell, the reward dataset doesn't play a role in actor pre-training, so I don't think changing the score value to "0.01" is what caused the assertion error.
I also got the assertion error. I ran `python artifacts/main.py artifacts/config/config.yaml --type ACTOR` and got the following error (I truncated some of the assertion errors):
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [334,0,0], thread: [108,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [334,0,0], thread: [109,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [334,0,0], thread: [110,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [334,0,0], thread: [111,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [334,0,0], thread: [112,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [334,0,0], thread: [113,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [334,0,0], thread: [114,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
.
.
.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [334,0,0], thread: [63,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Traceback (most recent call last):
File "/home/anirudh/rlhf/artifacts/main.py", line 51, in <module>
actor_trainer.train()
File "/home/anirudh/chatllama_test/venv/lib/python3.10/site-packages/chatllama/rlhf/actor.py", line 373, in train
est_output = self.model(training_input, attention_mask)
File "/home/anirudh/chatllama_test/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "<@beartype(chatllama.rlhf.actor.ActorModel.forward) at 0x7f30171c2290>", line 51, in forward
File "/home/anirudh/chatllama_test/venv/lib/python3.10/site-packages/chatllama/rlhf/actor.py", line 114, in forward
model_output = self.model.forward(
File "/home/anirudh/chatllama_test/venv/lib/python3.10/site-packages/transformers/models/opt/modeling_opt.py", line 932, in forward
outputs = self.model.decoder(
File "/home/anirudh/chatllama_test/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/home/anirudh/chatllama_test/venv/lib/python3.10/site-packages/transformers/models/opt/modeling_opt.py", line 639, in forward
attention_mask = self._prepare_decoder_attention_mask(
File "/home/anirudh/chatllama_test/venv/lib/python3.10/site-packages/transformers/models/opt/modeling_opt.py", line 542, in _prepare_decoder_attention_mask
expanded_attn_mask = _expand_mask(attention_mask, inputs_embeds.dtype, tgt_len=input_shape[-1]).to(
File "/home/anirudh/chatllama_test/venv/lib/python3.10/site-packages/transformers/models/opt/modeling_opt.py", line 89, in _expand_mask
expanded_mask = mask[:, None, None, :].expand(bsz, 1, tgt_len, src_len).to(dtype)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
This repo is super cool and it would be great if I could get some help!
^ Update (just for more context): `load_model_test` works! I believe this method uses a HF GPT2 tokenizer. However, if I use the `load_model` method, I get this assertion error. I also tried the initial version of llama_model.py (the one with a `test` flag) and ran into the assertion error problem when I set `test` to False (basically the same as calling `load_model`).
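In case it helps debugging, here is a hedged diagnostic sketch using only standard Hugging Face and PyTorch calls (not chatllama internals) that checks the two usual causes of this assert; running the training script with `CUDA_LAUNCH_BLOCKING=1` will also make the real failing op appear in the traceback instead of a later call:

```python
# Hedged diagnostic sketch using standard Hugging Face / PyTorch APIs only.
# It checks the two usual causes of the device-side assert above:
#   1. a token id >= the model's vocabulary (wrong tokenizer), and
#   2. a sequence longer than the model's position embeddings (no truncation).
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")  # matching tokenizer

text = "Example prompt standing in for one entry of the actor training dataset"
ids = tokenizer(text, return_tensors="pt").input_ids

vocab_rows = model.get_input_embeddings().num_embeddings
max_positions = model.config.max_position_embeddings

print("max token id:", ids.max().item(), "/ embedding rows:", vocab_rows)
print("sequence length:", ids.shape[1], "/ max positions:", max_positions)

assert ids.max().item() < vocab_rows, "token ids do not fit the embedding table"
assert ids.shape[1] <= max_positions, "input is longer than the position embeddings"
```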
I think this is a dataset/tokenizer problem. I use the LLaMA tokenizer with the opt-1.3b model and get this error at index 815/816. So what tokenizer should I use for opt-1.3b? The data:
[{"user_input": "New research suggests that metrics that are used to measure academic success, such as the number of publications, number of citations, and impact factor, have become targets and follow Goodhart\u2019s Law, according to which, \u201cwhen a measure becomes a target, it ceases to be a good measure.\u201d The original study can be found here: https://academic.oup.com/gigascience/article/8/6/giz053/5506490 Abstract: **Background** > The academic publishing world is changing significantly, with ever-growing numbers of publications each year and shifting publishing patterns. However, the metrics used to measure academic success, such as the number of publications, citation number, and impact factor, have not changed for decades. Moreover, recent studies indicate that these metrics have become targets and follow Goodhart\u2019s Law, according to which, \u201cwhen a measure becomes a target, it ceases to be a good measure.\u201d **Results** > In this study, we analyzed >120 million papers to examine how the academic publishing world has evolved over the last century, with a deeper look into the specific field of biology. Our study shows that the validity of citation-based measures is being compromised and their usefulness is lessening. In particular, the number of publications has ceased to be a good metric as a result of longer author lists, shorter papers, and surging publication numbers. Citation-based metrics, such citation number and h-index, are likewise affected by the flood of papers, self-citations, and lengthy reference lists. Measures such as a journal\u2019s impact factor have also ceased to be good metrics due to the soaring numbers of papers that are published in top journals, particularly from the same pool of authors. Moreover, by analyzing properties of >2,600 research fields, we observed that citation-based metrics are not beneficial for comparing researchers in different fields, or even in the same department. **Conclusions** > Academic publishing has changed considerably; now we need to reconsider how we measure success.", "completion": "This \u201cstudy\u201d is a hot steaming pile garbage and certainly doesn\u2019t show what it claims to. A couple overarching massive flaws: 1. They don\u2019t compare changes in # of papers or # of coauthors over time to changes in population. Their plot of # of papers looks a lot like a plot of the # of people in the world. At no point in the paper do they control for the effect of the increased number of researchers. At no point do they address the sharp decline in the past 15 years. 2. They make no effort to measure the quality of a paper, and yet claim that people are optimizing for metrics instead of improving. You need to actually give an indication that the quality of research isn\u2019t increasing to show that. 3. Their data analysis practice is abysmal. They misleadingly cut axes and rescale plots. They don\u2019t measure correlation, statistical significance, or effect size. They don\u2019t discuss the fact that the trends they\u2019re talking about sometimes only exist in some time periods and not others. They put Plot 10 in a paper, which is a horrible abomination that I still can\u2019t figure out how to read. Now for a point-by-point refutation.... >First, these results support Goodhart\u2019s Law as it relates to academic publishing: the measures (e.g., number of papers, number of citations, h-index, and impact factor) have become targets, and now they are no longer good measures. 
By making papers shorter and collaborating with more authors, researchers are able to produce more papers in the same amount of time. Moreover, we observed that the majority of changes in papers\u2019 properties are correlated with papers that receive higher numbers of citations (see Fig.\u00a0S13). Authors can use longer titles and abstracts, or use question or exclamation marks in titles, to make their papers more appealing. Thus, more readers are attracted to the paper, and ideally they will cite it, i.e., academic clickbait\u00a0[45]. These results support our hypothesis that the citation number has become a target. Consequently, the properties of academic papers have evolved in order to win\u2014to score a bullseye on the academic target. Of course the number of citations and coauthors has gone up, there are more researchers and papers than ever before! At no point did you ever present # of papers per author per year, so don\u2019t make any claims about it. I am 100% sure you could get the actual number in one line of code, and the fact that you choose not to when it\u2019s central to your narrative is a massive red flag. Additionally, you never consider the alternative hypothesis that writing quality of research papers has improved. Many research papers from the early 1900s are awful to read, incredibly dry, tedious, and poorly written. The fact that papers are shorter and more willing to engage in common rhetorical techniques like using questions in titles is a good thing in my mind. Writing papers that people want to read means writing better research papers. The analysis in this paper in no way demonstrates Goodhart\u2019s Law. To do that you need to show that an increase in metric is not due to an increase in performance. >It is worth noting that while the study\u2019s results provide evidence that many citation-based measures have become targets, there also may be other factors that influence academic publication trends. For example, the academic hypercompetitive environment itself may prompt an increase in productivity\u00a0[81], hence increasing the number of papers. However, this claim contradicts the findings of Fanelli and Larivi\u00e8re that researchers\u2019 individual productivity did not increase in the past century\u00a0[52]. Nevertheless, it is important to keep in mind that there may be other underlying factors that contributed to the observed results. Gee, you think? Shame other than this throwaway paragraph you don\u2019t actually seriously consider any alternatives to the narrative you\u2019re peddling. >Second, we observed that over time fewer papers list authors alphabetically, especially papers with a relatively high number of authors (see Results of Paper Trends section and Figs\u00a04 and\u00a0S5). These results may indicate the increased importance of an author\u2019s sequence number in the author list, which may reflect the author\u2019s contribution to the study. This result is another signal of the increasing importance of measures that rate an individual\u2019s research contribution. This isn\u2019t something you can meaningfully measure across all fields. Different fields have different conventions. Virtually every mathematics paper in the world today has alphabetical authorship or random authorship. In any event, this is barely a discernible trend in your plot and you made no effort to quantify the effect size or show that it was statistically significant. 
In the past 40 years, it looks like the needle has barely moved at all in terms of % of papers that have alphabetical authorship (Fig. 4). >Third, from matching papers to their L0 fields of study, we observed that the number of multidisciplinary papers has increased sharply over time (see Fig.\u00a06). It is important to keep in mind that these results were obtained by matching keywords to their corresponding fields of study. Therefore, these results have several limitations: first, not all papers contain keywords. Second, the dataset may not extract keywords from papers in the correct manner. For example, we found some papers contained keywords in their online version but not in their offline version (see Results of Paper Trends section). It is also possible that in some fields it is less common to use keywords. Therefore, the papers\u2019 keywords may be missing in the datasets, and the presented results may be an underestimate of the actual number of multidisciplinary studies. Nevertheless, we observed a strong trend in increasing numbers of multidisciplinary papers. This is totally disconnected from the rest of the analysis and never mentioned again. Is it supposed to be meaningful? >Fourth, from seeing sharp increases in both the maximal and mean number of self-citations (see Results of Paper Trends section and Figs\u00a07,\u00a09,\u00a010, and\u00a0S12), it is clear that citation numbers have become a target for some researchers, who cite their own papers dozens, or even hundreds, of times. Furthermore, we observed a general increasing trend for researchers to cite their previous work in their new studies. Moreover, from analyzing the percentage of papers without citations after 5\u00a0years, we observed that a huge quantity of papers (>72% of all papers and 25% of all papers with \u22655 references) have no citations at all (see Fig.\u00a09). Obviously, many resources are spent on papers with limited impact. The lack of citations may indicate that researchers are publishing more papers of poorer quality to boost their total number of publications. Additionally, by exploring papers\u2019 citation distributions (see Fig.\u00a010), we can observe that different decades have very different citation distributions. This result indicates that comparing citation records of researchers who published papers during different periods can be challenging. No, it\u2019s not clear that people are deliberately citing themselves to game research metrics. Self-citation will increase with no change in behavior if the number of people per paper increases, which you\u2019ve already shown. Controlling for this is a basic prerequisite for drawing a meaningful inference, and it\u2019s a shame you didn\u2019t do that. It will also increase with increased specialization, since if I write 5 papers on a single topic and you write 5 papers on 5 different topics it is quite reasonable to assume my papers will show more self-citation than yours. You did show that the percentage of interdisciplinary papers is increasing, this might be a good place to reference that fact. In order to make claims about \u201csome researchers\u201d you need to actually track to see if that\u2019s a stable group of people over time or not. Shame you didn\u2019t bother to do that either. This is data analysis 101, and you\u2019re failing it. You\u2019re also failing it in other ways, such as changing the scale of plots that are being compared and cutting axes off arbitrarily. 
What Figure 9 actually shows is that the percentage of papers with no citations in 5 years has plummeted. I don\u2019t understand how they can possibly write this with a straight face. Both of these numbers hit their all-time low between 2000 and today. In 1900 it was ~98% and today it\u2019s ~75%. That\u2019s good, right? Why are you presenting this like it\u2019s a bad thing?", "score": null}]
opt-1.3b is from Hugging Face: models--facebook--opt-1.3b
From the official paper:
> We tokenize all corpora using the GPT-2 byte level BPE tokenizer (Sennrich et al., 2016; Radford et al., 2019; Brown et al., 2020). Our final corpus contains roughly 180B tokens.
This might solve your problem: set the `tokenizer_model_path` param to the OPT tokenizer (https://huggingface.co/facebook/opt-1.3b), or change `load_model` to `load_model_test` in actor.py - this loads in a tokenizer from HF:
https://github.com/nebuly-ai/nebullvm/blob/d16dfd8312218540fbdc633222e4655480040778/apps/accelerate/chatllama/chatllama/rlhf/actor.py#L52
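For reference, a small sketch of what loading the matching tokenizer looks like with plain `transformers` (the path here is the HF model id, not the exact chatllama config value); `AutoTokenizer.from_pretrained("facebook/opt-1.3b")` resolves to the GPT-2 style byte-level BPE tokenizer described in the paper quote above:

```python
# Sketch only: load the tokenizer that matches facebook/opt-1.3b. If your config
# uses a local tokenizer_model_path, it should point at the same tokenizer files.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
print(type(tokenizer).__name__)  # a GPT-2 style byte-level BPE tokenizer
print(len(tokenizer))            # vocabulary size; must not exceed the model's embedding rows

ids = tokenizer("when a measure becomes a target, it ceases to be a good measure").input_ids
print(ids[:10])
```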
https://github.com/nebuly-ai/nebullvm/blob/d16dfd8312218540fbdc633222e4655480040778/apps/accelerate/chatllama/chatllama/rlhf/actor.py#L58

If I truncate the input data to the max_length, there is no such error.
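For reference, a hedged sketch of the truncation described above, applied at tokenization time with plain Hugging Face APIs; there, `max_length` counts tokens (whether the `max_length` used inside chatllama is measured the same way, I can't confirm):

```python
# Sketch only: truncate at tokenization time so no sequence exceeds the model's
# position embeddings. In the Hugging Face API, max_length is a number of tokens.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")  # assumed matching tokenizer

long_text = "New research suggests that metrics used to measure academic success " * 500
batch = tokenizer(
    long_text,
    truncation=True,
    max_length=2048,  # assumption: OPT's max_position_embeddings
    return_tensors="pt",
)
print(batch.input_ids.shape)  # sequence dimension is capped at 2048 tokens
```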
@lonelydancer is `max_length` the number of characters or words?
Hi everyone! @lonelydancer @bnuzhanyu @iMountTai @AAnirudh07 This problem should have been fixed in PR #306. Let me know if you are still struggling.
Amazing work. But may I ask how to solve the following problem?