Open · iMountTai opened this issue 1 year ago
Hi @iMountTai, thank you for reaching out. Could you please share with us your current setup? Are you using the artifacts as provided in the quickstart or have you modified them?
I first downloaded the pre-trained weights, such as gpt2-large and opt-1.3B. The setup is basically unchanged, except that the model is loaded from the downloaded weights. The output log is below. Env: torch 1.12 + cu113, transformers 4.26.0
Hi @iMountTai, I have just a few questions.
Same error running `python artifacts/main.py artifacts/config/config.yaml --type ACTOR`.
In the reward JSON file, the score is None, so I changed it to "0.01".
NVIDIA-SMI 515.43.04 Driver Version: 515.43.04 CUDA Version: 11.7 torch-1.13.0+cu117 pytorch-cuda 11.7
File "/home/hupenglong/nebullvm/apps/accelerate/chatllama/chatllama/rlhf/actor.py", line 373, in train
est_output = self.model(training_input, attention_mask)
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "<@beartype(chatllama.rlhf.actor.ActorModel.forward) at 0x7fcb26cf6b80>", line 51, in forward
File "/home/hupenglong/nebullvm/apps/accelerate/chatllama/chatllama/rlhf/actor.py", line 114, in forward
model_output = self.model.forward(
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/transformers/models/opt/modeling_opt.py", line 934, in forward
outputs = self.model.decoder(
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/transformers/models/opt/modeling_opt.py", line 698, in forward
layer_outputs = decoder_layer(
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/transformers/models/opt/modeling_opt.py", line 327, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/transformers/models/opt/modeling_opt.py", line 172, in forward
query_states = self.q_proj(hidden_states) * self.scaling
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/hupenglong/anaconda3/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasLtMatmul( ltHandle, computeDesc.descriptor(), &alpha_val, mat1_ptr, Adesc.descriptor(), mat2_ptr, Bdesc.descriptor(), &beta_val, result_ptr, Cdesc.descriptor(), result_ptr, Cdesc.descriptor(), &heuristicResult.algo, workspace.data_ptr(), workspaceSize, at::cuda::getCurrentCUDAStream())
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [96,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [97,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [98,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [99,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [100,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [101,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [102,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [103,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [104,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [105,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [106,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [107,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
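(Side note, not specific to chatllama: these `indexSelectLargeIndex` asserts are the usual symptom of an embedding lookup receiving an out-of-range index, either a token id at or above the model's vocabulary size or a position beyond `max_position_embeddings`. A minimal sketch of that failure mode, assuming the 50272-entry vocabulary that the facebook/opt-1.3b config reports:)

```python
# Minimal sketch of the failure mode (not chatllama code): an embedding lookup
# with an index outside the table. On CPU the same mistake raises a readable
# IndexError instead of a device-side assert.
import torch

vocab_size = 50272  # assumption: vocabulary size reported by the facebook/opt-1.3b config
emb = torch.nn.Embedding(vocab_size, 16)

bad_ids = torch.tensor([[1, 2, vocab_size + 10]])  # one id past the end of the table
try:
    emb(bad_ids)
except IndexError as err:
    print("CPU error:", err)  # "index out of range in self"
```

On CUDA the same lookup trips the device-side assert, and any later kernel (such as the cuBLAS matmul above) then reports a misleading error.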
> In the reward JSON file, the score is None, so I changed it to "0.01".
Hey @lonelydancer, as far as I can tell, the reward dataset doesn't play a role in actor pre-training, so I don't think changing the score value to "0.01" is what caused the assertion error.
I also got the assertion error. I ran `python artifacts/main.py artifacts/config/config.yaml --type ACTOR` and got the following error (I truncated some of the assertion errors):
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [334,0,0], thread: [108,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [334,0,0], thread: [109,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [334,0,0], thread: [110,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [334,0,0], thread: [111,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [334,0,0], thread: [112,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [334,0,0], thread: [113,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [334,0,0], thread: [114,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
.
.
.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [334,0,0], thread: [63,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Traceback (most recent call last):
File "/home/anirudh/rlhf/artifacts/main.py", line 51, in <module>
actor_trainer.train()
File "/home/anirudh/chatllama_test/venv/lib/python3.10/site-packages/chatllama/rlhf/actor.py", line 373, in train
est_output = self.model(training_input, attention_mask)
File "/home/anirudh/chatllama_test/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "<@beartype(chatllama.rlhf.actor.ActorModel.forward) at 0x7f30171c2290>", line 51, in forward
File "/home/anirudh/chatllama_test/venv/lib/python3.10/site-packages/chatllama/rlhf/actor.py", line 114, in forward
model_output = self.model.forward(
File "/home/anirudh/chatllama_test/venv/lib/python3.10/site-packages/transformers/models/opt/modeling_opt.py", line 932, in forward
outputs = self.model.decoder(
File "/home/anirudh/chatllama_test/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/home/anirudh/chatllama_test/venv/lib/python3.10/site-packages/transformers/models/opt/modeling_opt.py", line 639, in forward
attention_mask = self._prepare_decoder_attention_mask(
File "/home/anirudh/chatllama_test/venv/lib/python3.10/site-packages/transformers/models/opt/modeling_opt.py", line 542, in _prepare_decoder_attention_mask
expanded_attn_mask = _expand_mask(attention_mask, inputs_embeds.dtype, tgt_len=input_shape[-1]).to(
File "/home/anirudh/chatllama_test/venv/lib/python3.10/site-packages/transformers/models/opt/modeling_opt.py", line 89, in _expand_mask
expanded_mask = mask[:, None, None, :].expand(bsz, 1, tgt_len, src_len).to(dtype)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
This repo is super cool and it would be great if I could get some help!
^ Update (just for more context): `load_model_test` works! I believe this method uses a HF GPT2 tokenizer. However, if I use the `load_model` method, I get this assertion error. I also tried the initial version of llama_model.py (the one with a `test` flag) and ran into the assertion error problem when I set `test` to False (basically the same as calling `load_model`).
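In case it helps debugging, here is a hedged diagnostic sketch using only standard Hugging Face and PyTorch calls (not chatllama internals) that checks the two usual causes of this assert; running the training script with `CUDA_LAUNCH_BLOCKING=1` will also make the real failing op appear in the traceback instead of a later call:

```python
# Hedged diagnostic sketch using standard Hugging Face / PyTorch APIs only.
# It checks the two usual causes of the device-side assert above:
#   1. a token id >= the model's vocabulary (wrong tokenizer), and
#   2. a sequence longer than the model's position embeddings (no truncation).
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")  # matching tokenizer

text = "Example prompt standing in for one entry of the actor training dataset"
ids = tokenizer(text, return_tensors="pt").input_ids

vocab_rows = model.get_input_embeddings().num_embeddings
max_positions = model.config.max_position_embeddings

print("max token id:", ids.max().item(), "/ embedding rows:", vocab_rows)
print("sequence length:", ids.shape[1], "/ max positions:", max_positions)

assert ids.max().item() < vocab_rows, "token ids do not fit the embedding table"
assert ids.shape[1] <= max_positions, "input is longer than the position embeddings"
```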
I think this is a dataset/tokenizer problem. I use the LLaMA tokenizer with the opt-1.3b model and get this error at index 815/816. So what tokenizer should I use for opt-1.3b? The data:
[{"user_input": "New research suggests that metrics that are used to measure academic success, such as the number of publications, number of citations, and impact factor, have become targets and follow Goodhart\u2019s Law, according to which, \u201cwhen a measure becomes a target, it ceases to be a good measure.\u201d The original study can be found here: https://academic.oup.com/gigascience/article/8/6/giz053/5506490 Abstract: **Background** > The academic publishing world is changing significantly, with ever-growing numbers of publications each year and shifting publishing patterns. However, the metrics used to measure academic success, such as the number of publications, citation number, and impact factor, have not changed for decades. Moreover, recent studies indicate that these metrics have become targets and follow Goodhart\u2019s Law, according to which, \u201cwhen a measure becomes a target, it ceases to be a good measure.\u201d **Results** > In this study, we analyzed >120 million papers to examine how the academic publishing world has evolved over the last century, with a deeper look into the specific field of biology. Our study shows that the validity of citation-based measures is being compromised and their usefulness is lessening. In particular, the number of publications has ceased to be a good metric as a result of longer author lists, shorter papers, and surging publication numbers. Citation-based metrics, such citation number and h-index, are likewise affected by the flood of papers, self-citations, and lengthy reference lists. Measures such as a journal\u2019s impact factor have also ceased to be good metrics due to the soaring numbers of papers that are published in top journals, particularly from the same pool of authors. Moreover, by analyzing properties of >2,600 research fields, we observed that citation-based metrics are not beneficial for comparing researchers in different fields, or even in the same department. **Conclusions** > Academic publishing has changed considerably; now we need to reconsider how we measure success.", "completion": "This \u201cstudy\u201d is a hot steaming pile garbage and certainly doesn\u2019t show what it claims to. A couple overarching massive flaws: 1. They don\u2019t compare changes in # of papers or # of coauthors over time to changes in population. Their plot of # of papers looks a lot like a plot of the # of people in the world. At no point in the paper do they control for the effect of the increased number of researchers. At no point do they address the sharp decline in the past 15 years. 2. They make no effort to measure the quality of a paper, and yet claim that people are optimizing for metrics instead of improving. You need to actually give an indication that the quality of research isn\u2019t increasing to show that. 3. Their data analysis practice is abysmal. They misleadingly cut axes and rescale plots. They don\u2019t measure correlation, statistical significance, or effect size. They don\u2019t discuss the fact that the trends they\u2019re talking about sometimes only exist in some time periods and not others. They put Plot 10 in a paper, which is a horrible abomination that I still can\u2019t figure out how to read. Now for a point-by-point refutation.... >First, these results support Goodhart\u2019s Law as it relates to academic publishing: the measures (e.g., number of papers, number of citations, h-index, and impact factor) have become targets, and now they are no longer good measures. 
By making papers shorter and collaborating with more authors, researchers are able to produce more papers in the same amount of time. Moreover, we observed that the majority of changes in papers\u2019 properties are correlated with papers that receive higher numbers of citations (see Fig.\u00a0S13). Authors can use longer titles and abstracts, or use question or exclamation marks in titles, to make their papers more appealing. Thus, more readers are attracted to the paper, and ideally they will cite it, i.e., academic clickbait\u00a0[45]. These results support our hypothesis that the citation number has become a target. Consequently, the properties of academic papers have evolved in order to win\u2014to score a bullseye on the academic target. Of course the number of citations and coauthors has gone up, there are more researchers and papers than ever before! At no point did you ever present # of papers per author per year, so don\u2019t make any claims about it. I am 100% sure you could get the actual number in one line of code, and the fact that you choose not to when it\u2019s central to your narrative is a massive red flag. Additionally, you never consider the alternative hypothesis that writing quality of research papers has improved. Many research papers from the early 1900s are awful to read, incredibly dry, tedious, and poorly written. The fact that papers are shorter and more willing to engage in common rhetorical techniques like using questions in titles is a good thing in my mind. Writing papers that people want to read means writing better research papers. The analysis in this paper in no way demonstrates Goodhart\u2019s Law. To do that you need to show that an increase in metric is not due to an increase in performance. >It is worth noting that while the study\u2019s results provide evidence that many citation-based measures have become targets, there also may be other factors that influence academic publication trends. For example, the academic hypercompetitive environment itself may prompt an increase in productivity\u00a0[81], hence increasing the number of papers. However, this claim contradicts the findings of Fanelli and Larivi\u00e8re that researchers\u2019 individual productivity did not increase in the past century\u00a0[52]. Nevertheless, it is important to keep in mind that there may be other underlying factors that contributed to the observed results. Gee, you think? Shame other than this throwaway paragraph you don\u2019t actually seriously consider any alternatives to the narrative you\u2019re peddling. >Second, we observed that over time fewer papers list authors alphabetically, especially papers with a relatively high number of authors (see Results of Paper Trends section and Figs\u00a04 and\u00a0S5). These results may indicate the increased importance of an author\u2019s sequence number in the author list, which may reflect the author\u2019s contribution to the study. This result is another signal of the increasing importance of measures that rate an individual\u2019s research contribution. This isn\u2019t something you can meaningfully measure across all fields. Different fields have different conventions. Virtually every mathematics paper in the world today has alphabetical authorship or random authorship. In any event, this is barely a discernible trend in your plot and you made no effort to quantify the effect size or show that it was statistically significant. 
In the past 40 years, it looks like the needle has barely moved at all in terms of % of papers that have alphabetical authorship (Fig. 4). >Third, from matching papers to their L0 fields of study, we observed that the number of multidisciplinary papers has increased sharply over time (see Fig.\u00a06). It is important to keep in mind that these results were obtained by matching keywords to their corresponding fields of study. Therefore, these results have several limitations: first, not all papers contain keywords. Second, the dataset may not extract keywords from papers in the correct manner. For example, we found some papers contained keywords in their online version but not in their offline version (see Results of Paper Trends section). It is also possible that in some fields it is less common to use keywords. Therefore, the papers\u2019 keywords may be missing in the datasets, and the presented results may be an underestimate of the actual number of multidisciplinary studies. Nevertheless, we observed a strong trend in increasing numbers of multidisciplinary papers. This is totally disconnected from the rest of the analysis and never mentioned again. Is it supposed to be meaningful? >Fourth, from seeing sharp increases in both the maximal and mean number of self-citations (see Results of Paper Trends section and Figs\u00a07,\u00a09,\u00a010, and\u00a0S12), it is clear that citation numbers have become a target for some researchers, who cite their own papers dozens, or even hundreds, of times. Furthermore, we observed a general increasing trend for researchers to cite their previous work in their new studies. Moreover, from analyzing the percentage of papers without citations after 5\u00a0years, we observed that a huge quantity of papers (>72% of all papers and 25% of all papers with \u22655 references) have no citations at all (see Fig.\u00a09). Obviously, many resources are spent on papers with limited impact. The lack of citations may indicate that researchers are publishing more papers of poorer quality to boost their total number of publications. Additionally, by exploring papers\u2019 citation distributions (see Fig.\u00a010), we can observe that different decades have very different citation distributions. This result indicates that comparing citation records of researchers who published papers during different periods can be challenging. No, it\u2019s not clear that people are deliberately citing themselves to game research metrics. Self-citation will increase with no change in behavior if the number of people per paper increases, which you\u2019ve already shown. Controlling for this is a basic prerequisite for drawing a meaningful inference, and it\u2019s a shame you didn\u2019t do that. It will also increase with increased specialization, since if I write 5 papers on a single topic and you write 5 papers on 5 different topics it is quite reasonable to assume my papers will show more self-citation than yours. You did show that the percentage of interdisciplinary papers is increasing, this might be a good place to reference that fact. In order to make claims about \u201csome researchers\u201d you need to actually track to see if that\u2019s a stable group of people over time or not. Shame you didn\u2019t bother to do that either. This is data analysis 101, and you\u2019re failing it. You\u2019re also failing it in other ways, such as changing the scale of plots that are being compared and cutting axes off arbitrarily. 
What Figure 9 actually shows is that the percentage of papers with no citations in 5 years has plummeted. I don\u2019t understand how they can possibly write this with a straight face. Both of these numbers hit their all-time low between 2000 and today. In 1900 it was ~98% and today it\u2019s ~75%. That\u2019s good, right? Why are you presenting this like it\u2019s a bad thing?", "score": null}]
opt-1.3b is from Hugging Face: models--facebook--opt-1.3b
From the official paper:
> We tokenize all corpora using the GPT-2 byte level BPE tokenizer (Sennrich et al., 2016; Radford et al., 2019; Brown et al., 2020). Our final corpus contains roughly 180B tokens.
This might solve your problem: set the `tokenizer_model_path` param to the OPT tokenizer (https://huggingface.co/facebook/opt-1.3b), or change `load_model` to `load_model_test` in actor.py - this loads in a tokenizer from HF:
https://github.com/nebuly-ai/nebullvm/blob/d16dfd8312218540fbdc633222e4655480040778/apps/accelerate/chatllama/chatllama/rlhf/actor.py#L52
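For reference, a small sketch of what loading the matching tokenizer looks like with plain `transformers` (the path here is the HF model id, not the exact chatllama config value); `AutoTokenizer.from_pretrained("facebook/opt-1.3b")` resolves to the GPT-2 style byte-level BPE tokenizer described in the paper quote above:

```python
# Sketch only: load the tokenizer that matches facebook/opt-1.3b. If your config
# uses a local tokenizer_model_path, it should point at the same tokenizer files.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
print(type(tokenizer).__name__)  # a GPT-2 style byte-level BPE tokenizer
print(len(tokenizer))            # vocabulary size; must not exceed the model's embedding rows

ids = tokenizer("when a measure becomes a target, it ceases to be a good measure").input_ids
print(ids[:10])
```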
https://github.com/nebuly-ai/nebullvm/blob/d16dfd8312218540fbdc633222e4655480040778/apps/accelerate/chatllama/chatllama/rlhf/actor.py#L58

If I truncate the input data to the max_length, there is no such error.
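For reference, a hedged sketch of the truncation described above, applied at tokenization time with plain Hugging Face APIs; there, `max_length` counts tokens (whether the `max_length` used inside chatllama is measured the same way, I can't confirm):

```python
# Sketch only: truncate at tokenization time so no sequence exceeds the model's
# position embeddings. In the Hugging Face API, max_length is a number of tokens.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")  # assumed matching tokenizer

long_text = "New research suggests that metrics used to measure academic success " * 500
batch = tokenizer(
    long_text,
    truncation=True,
    max_length=2048,  # assumption: OPT's max_position_embeddings
    return_tensors="pt",
)
print(batch.input_ids.shape)  # sequence dimension is capped at 2048 tokens
```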
@lonelydancer is `max_length` the number of characters or words?
Hi everyone! @lonelydancer @bnuzhanyu @iMountTai @AAnirudh07 This problem should have been fixed in PR #306. Let me know if you are still struggling.
Amazing work. But may I ask how to solve the following problem?