jeffbinder / visions-and-revisions

Neural network poetry rewriter

Issues with 2021 version(s) of 'visions.py' not present in 2020 versions #3

Closed GenTxt closed 3 years ago

GenTxt commented 3 years ago

Hi Jeff:

Hope all is well. I've enjoyed working with your repo, and I was looking forward to testing the deberta and deberta-v2 models, but unfortunately I've hit a few snags. I've had no issues with the 2020 versions, including the use of custom BERT checkpoints, but I'm now encountering errors that I suspect are due to missing dependencies in my Ubuntu 18.04 Python 3.6/3.7 environment.

I would like to set up a virtual python environment that best mimics what works at your end.

In addition to the above I'm running:

transformers 4.4.2, torch 1.7.0+cu101 (Python 3.6/3.7), NVIDIA-SMI 460.32.03, driver version 460.32.03, CUDA version 11.2

Can you suggest any changes to the above that might improve performance?

Using the Feb. 26 version (same errors with the latest).

Everything starts well as per previous scripts.

CUDA_VISIBLE_DEVICES=0 python3.7 banalify_deBerta.py (your banalify.py script)

Generating deberta_meter_dict.pkl

The following are the errors I've encountered. I commented out some of the lines, which appeared to work until the last error.

visions.py", line 847, in depoeticize
    modifier = modifier()
TypeError: 'bool' object is not callable

Changed to: modifier = None #modifier()

Note: I didn't manage to get 'metalness.json' working in the original scripts.

visions.py", line 958, in depoeticize
    discouraged_words[idx] = discourage_repetition
TypeError: can't assign a list to a torch.FloatTensor

Commented out the entire snippet starting at the comment "Discourage the selection of words already in the text, save for stopwords."

Ran it again ... no errors above, then ...

visions.py", line 458, in compute_probs_for_masked_tokens
    nbatches = math.ceil(ntexts / batch_size)
ZeroDivisionError: division by zero
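(For what it's worth, that last error implies batch_size was 0 when the function was called; a minimal guard, just to illustrate the failure mode, not the repo's actual fix:)

```python
import math

def num_batches(ntexts: int, batch_size: int) -> int:
    # A ZeroDivisionError from ntexts / batch_size means batch_size was 0;
    # fail loudly instead so the real cause is visible.
    if batch_size <= 0:
        raise ValueError(f"batch_size must be positive, got {batch_size}")
    return math.ceil(ntexts / batch_size)
```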

Hoping a correctly configured virtual environment will solve these issues.

Cheers

jeffbinder commented 3 years ago

Thanks for reporting the problem! I'm swamped with work at the moment but will check this out in a few days.

jeffbinder commented 3 years ago

Hi there,

I just pushed some major changes that should address at least most of these errors. I'm sorry this took so long. April was a crazy month for me, and fixing the problems turned out to take a couple days' work! The DeBERTa tokenizer seems to handle some spacing and punctuation situations differently from the others, and I had to redo most of the banalify function to get things working right. If you're still getting errors, please (if you don't mind) post the exact code that you're running along with some input text that produces the error.

The banalify function still has some problems related to how the tokenizer handles certain sequences of characters, so it may fail depending on the exact content of your input. I would recommend sticking with ASCII, because the tokenizers will sometimes mangle Unicode.

I have been using nightly builds of torch and a modified version of the git master version of transformers. You may be able to get better performance if you use a torch build compiled with an 11.x version of CUDA, although it depends on your GPU.

Getting DeBERTa to work on your system will likely take some doing at present. In the current master version of transformers, DebertaForMaskedLM doesn't load the MLM head weights from the pretrained model because their names don't line up with how they're implemented in the class. If you try to use it, you will get a warning like this:

Some weights of the model checkpoint at microsoft/deberta-base were not used when initializing DebertaForMaskedLM: ['lm_predictions.lm_head.bias', 'lm_predictions.lm_head.dense.weight', 'lm_predictions.lm_head.dense.bias', 'lm_predictions.lm_head.LayerNorm.weight', 'lm_predictions.lm_head.LayerNorm.bias', 'config', 'deberta.embeddings.position_embeddings.weight']

If you don't fix this problem, the results will be gibberish. The weights are in there, though, and you can load them by applying the following patch to transformers:

diff --git a/src/transformers/modeling_utils.py b/src/transformers/modeling_utils.py
index 66875a028..1c4af42dc 100755
--- a/src/transformers/modeling_utils.py
+++ b/src/transformers/modeling_utils.py
@@ -1180,6 +1180,16 @@ class PreTrainedModel(nn.Module, ModuleUtilsMixin, GenerationMixin, PushToHubMix
                     new_key = key.replace("gamma", "weight")
                 if "beta" in key:
                     new_key = key.replace("beta", "bias")
+                if key == "lm_predictions.lm_head.bias":
+                    new_key = "cls.predictions.bias"
+                if key == "lm_predictions.lm_head.dense.weight":
+                    new_key = "cls.predictions.transform.dense.weight"
+                if key == "lm_predictions.lm_head.dense.bias":
+                    new_key = "cls.predictions.transform.dense.bias"
+                if key == "lm_predictions.lm_head.LayerNorm.weight":
+                    new_key = "cls.predictions.transform.LayerNorm.weight"
+                if key == "lm_predictions.lm_head.LayerNorm.bias":
+                    new_key = "cls.predictions.transform.LayerNorm.bias"
                 if new_key:
                     old_keys.append(key)
                     new_keys.append(new_key)

You will still get a warning about deberta.embeddings.position_embeddings.weight, but that doesn't seem to be a problem because the model uses the same embeddings for input and output. As far as I can tell, both versions of DeBERTa work properly with this patch, and the big ones seem to produce significantly more coherent output than the other models.
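If you'd rather not patch transformers itself, the same renaming could in principle be done on the checkpoint's state dict before loading. Here is a generic, untested sketch (the key map simply mirrors the renames in the patch above):

```python
# Generic key-renaming sketch; KEY_MAP mirrors the renames in the patch above.
KEY_MAP = {
    "lm_predictions.lm_head.bias": "cls.predictions.bias",
    "lm_predictions.lm_head.dense.weight": "cls.predictions.transform.dense.weight",
    "lm_predictions.lm_head.dense.bias": "cls.predictions.transform.dense.bias",
    "lm_predictions.lm_head.LayerNorm.weight": "cls.predictions.transform.LayerNorm.weight",
    "lm_predictions.lm_head.LayerNorm.bias": "cls.predictions.transform.LayerNorm.bias",
}

def rename_mlm_head_keys(state_dict):
    """Return a copy of the state dict with the MLM head keys renamed."""
    return {KEY_MAP.get(key, key): value for key, value in state_dict.items()}
```

You would then load the checkpoint weights yourself (e.g. with torch.load), rename them, and pass the result to the model's load_state_dict. I haven't verified this route end to end, so the patch remains the tested option.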

GenTxt commented 3 years ago

Hi Jeff:

Thanks for the update. The previous issues have been fixed, and all that remains is applying the patch. Unfortunately, I'm not familiar with using git diff; I'd appreciate the commands to apply the above.

Cheers

jeffbinder commented 3 years ago

Usually (on a UNIX-like system) you would just copy the patch into a text file and type patch -p1 <yourpatchfilename.diff while in the base transformers directory. However, it looks like there was a big change to modeling_utils.py just this morning, so that patch no longer lines up with the master version. Here's an updated diff that should work now:

diff --git a/src/transformers/modeling_utils.py b/src/transformers/modeling_utils.py
index 8160b4ba3..d8bd14e0a 100644
--- a/src/transformers/modeling_utils.py
+++ b/src/transformers/modeling_utils.py
@@ -1236,6 +1236,16 @@ class PreTrainedModel(nn.Module, ModuleUtilsMixin, GenerationMixin, PushToHubMix
                 new_key = key.replace("gamma", "weight")
             if "beta" in key:
                 new_key = key.replace("beta", "bias")
+            if key == "lm_predictions.lm_head.bias":
+                new_key = "cls.predictions.bias"
+            if key == "lm_predictions.lm_head.dense.weight":
+                new_key = "cls.predictions.transform.dense.weight"
+            if key == "lm_predictions.lm_head.dense.bias":
+                new_key = "cls.predictions.transform.dense.bias"
+            if key == "lm_predictions.lm_head.LayerNorm.weight":
+                new_key = "cls.predictions.transform.LayerNorm.weight"
+            if key == "lm_predictions.lm_head.LayerNorm.bias":
+                new_key = "cls.predictions.transform.LayerNorm.bias"
             if new_key:
                 old_keys.append(key)
                 new_keys.append(new_key)

You could also apply the patch manually by copying the new lines (the ones that start with "+") into the appropriate place in the file, making sure to remove the "+" from each line. The -1236,6 +1236,16 means that the affected text starts at line 1236 and goes for 6 lines in the old version and starts at 1236 and goes for 16 lines in the new version. The lines without "+" are present in the original file, so you could find the right spot by doing a word search for them.
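If it helps to see the mechanics end to end, here is a throwaway demo of patch -p1 on a scratch directory (nothing transformers-specific; the -p1 strips the leading a/ and b/ from the paths in the diff):

```shell
# Scratch demo of `patch -p1`; the a/ and b/ path prefixes are stripped by -p1.
demo="$(mktemp -d)"
mkdir -p "$demo/src"
printf 'hello\n' > "$demo/src/file.txt"
cat > "$demo/fix.diff" <<'EOF'
--- a/src/file.txt
+++ b/src/file.txt
@@ -1 +1,2 @@
 hello
+world
EOF
cd "$demo" && patch -p1 < fix.diff
cat src/file.txt
```

After patching, src/file.txt contains the original "hello" line followed by the added "world" line; the same idea applies when you run the command from the base transformers directory.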

GenTxt commented 3 years ago

Thanks for the explanation. I used the original patch, as it matches my current transformers version. Everything is working great with deberta-v2 now.

cheers

jeffbinder commented 3 years ago

That's great! I'm closing the issue but feel free to open another one if you have further problems.

GenTxt commented 3 years ago

Hi Jeff:

I forgot to ask if you could recommend a repo or script that can create a similar 'metalness.json' from another text corpus. I've taken a look at the parent repo 'pythonic-metal', but there are a lot of extra processes to wade through in the notebooks, and it's not clear which commands apply given the format of lyrics.csv. I would like to see results using simple text files from Project Gutenberg sources, etc.

Cheers

jeffbinder commented 3 years ago

I haven't actually tried to generate a json file like that myself. It would make sense to include a script for generating it in this repo, though, since this would be needed to make full use of the modifier parameter. I'll take a crack at it.

jeffbinder commented 3 years ago

I just added a script called generate_modifier.py that can do this. Note that I didn't exclude stopwords, so the results will not be quite the same as the "metalness" file.
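(Purely to illustrate the idea, a corpus-to-JSON scorer along these lines might rate each word by its log relative frequency; generate_modifier.py itself may compute scores differently.)

```python
# Illustrative only: score each word by log relative frequency in a corpus.
# The repo's generate_modifier.py may use a different scoring scheme.
import json
import math
import re
from collections import Counter

def build_modifier_json(corpus_path, out_path):
    with open(corpus_path, encoding="utf-8") as f:
        words = re.findall(r"[a-z']+", f.read().lower())
    counts = Counter(words)
    total = sum(counts.values())
    scores = {word: math.log(count / total) for word, count in counts.items()}
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(scores, f)
    return scores
```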

To use the generated modifier, pass the parameter modifier=json_modifier('<your filename>.json'). I also added some options to adjust how the modifier is applied. In the previous version, it assigned a neutral score of 0.0 to words that are not in the JSON file. This is fine if the corpus is large enough to span a large part of the English vocabulary, but I found that, if the corpus is relatively small, the modifier feature ended up having little effect on the output. To fix this, I changed it to assign a score of -10.0 to words that are not in the JSON file. You can adjust this with the default_score parameter.
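Concretely, the lookup behaves roughly like this (a simplified sketch of the described behavior, not the repo's exact code):

```python
# Sketch of the described behavior: look a word up in the JSON scores,
# falling back to default_score (-10.0 per the current default) if absent.
import json

def json_modifier(filename, default_score=-10.0):
    with open(filename, encoding="utf-8") as f:
        scores = json.load(f)
    def modifier(word):
        return scores.get(word, default_score)
    return modifier
```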

GenTxt commented 3 years ago

Thanks. New script and generated .json files work perfectly.

Cheers
