Open lindsaydbrin opened 1 year ago
Hi @lindsaydbrin - thank you so much for the detailed error message, this is super explanatory and helpful. The flair versioning has been a constant headache for us, but what you've described seems super straightforward to fix. Hopefully someone can help us.
Hi @jxmorris12 - No problem, happy to help! What do you mean that hopefully someone can help us? I think with the textattack code changes I described, it should fix the problem(s).
I could do it, I just don't have the environment set up, so I thought it might be easy for someone (you or otherwise) to make the change quickly. But if that's not easy, I don't mind. (Or am I missing something?)
oh yes, I meant if someone puts up a pull request! I will have time eventually but likely can't make the changes this week.
As for the second issue proposed in https://github.com/QData/TextAttack/issues/713#issue-1574652467, I replaced self._enptb_to_universal with a dictionary extended with the relevant universal POS tags, in the textattack/transformations/word_swaps/word_swap_inflections.py file, as in the following code.
self._enptb_to_universal = {
#-----dictionary with new tags---------
"PUNCT": ".",
"CCONJ": "CONJ",
"SCONJ": "CONJ",
"PROPN": "NOUN",
"PART": "PRT",
"AUX": "VERB",
"SYM": "NOUN",
"INTJ":"X",
#----original dictionary below----------
"JJRJR": "ADJ",
"VBN": "VERB",
"VBP": "VERB",
"JJ": "ADJ",
"VBZ": "VERB",
"VBG": "VERB",
"NN": "NOUN",
"VBD": "VERB",
"NP": "NOUN",
"NNP": "NOUN",
"VB": "VERB",
"NNS": "NOUN",
"VP": "VERB",
"TO": "VERB",
"MD": "VERB",
"NNPS": "NOUN",
"JJS": "ADJ",
"JJR": "ADJ",
"RB": "ADJ",
}
(Mapping info: https://github.com/slavpetrov/universal-pos-tags and https://zhuanlan.zhihu.com/p/427520069.)
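One way to make the lookup tolerant of either tag set is to pass universal tags through unchanged and convert Penn Treebank tags via the dictionary. The following is only a minimal sketch under that assumption: `to_inflection_pos`, `PTB_TO_UNIVERSAL`, and `UNIVERSAL_TAGS` are hypothetical names for illustration and are not part of textattack, and the dictionary is a small subset of the one above.

```python
# Hypothetical stand-alone sketch; these names are NOT textattack's API.
# Subset of the original Penn Treebank -> universal mapping shown above.
PTB_TO_UNIVERSAL = {
    "NN": "NOUN",
    "NNS": "NOUN",
    "VB": "VERB",
    "VBD": "VERB",
    "JJ": "ADJ",
    "JJR": "ADJ",
}

# Universal tags this sketch treats as candidates for inflection (assumption).
UNIVERSAL_TAGS = {"NOUN", "VERB", "ADJ"}


def to_inflection_pos(tag):
    """Return a universal POS tag for `tag`, or None if the tag is not handled.

    Accepts both tag sets: universal tags pass through unchanged,
    Penn Treebank tags are converted via the dictionary.
    """
    if tag in UNIVERSAL_TAGS:
        return tag
    return PTB_TO_UNIVERSAL.get(tag)


if __name__ == "__main__":
    for tag in ("NOUN", "NNS", "VBD", "PUNCT"):
        print(tag, "->", to_inflection_pos(tag))
```

With a helper like this, a tagger that returns "NOUN" and one that returns "NN" would both lead to the same inflection lookup, while unhandled tags such as "PUNCT" are skipped.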
Describe the bug
I ran into two flair-related issues while using the Word Swap by Inflections transformation. The first one required a flair update, and the second required a small textattack code change.
I'd like to suggest requiring a newer version of flair, as well as a small code change. Alternatively, if a specific older version of flair is preferred and works with the textattack code as is, specifying that version would also solve the problem.
First issue
On line 220 of textattack/shared/utils/strings.py, textattack calls the predict() method of flair's SequenceTagger and passes the argument force_token_predictions=True. I happened to be using an older version of flair (0.6) for which the predict method in question does not have the parameter force_token_predictions, so the call failed with an error.
Updating to the current version (0.11) addressed the issue.
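If supporting several flair versions were a goal, the keyword could be passed only when the installed predict() accepts it. This is a rough sketch of that idea, not what textattack does: `supports_kwarg` and `safe_predict` are hypothetical helpers, and `OldTagger` / `NewTagger` are stand-ins that only mimic the two predict() signatures described above so the example runs without flair installed.

```python
import inspect


def supports_kwarg(func, name):
    """Return True if `func` accepts a keyword argument called `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        return True
    # A **kwargs parameter would also accept the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class OldTagger:
    """Stand-in for a tagger whose predict() lacks force_token_predictions."""

    def predict(self, sentences, mini_batch_size=32):
        return "old"


class NewTagger:
    """Stand-in for a tagger whose predict() has force_token_predictions."""

    def predict(self, sentences, mini_batch_size=32, force_token_predictions=False):
        return ("new", force_token_predictions)


def safe_predict(tagger, sentences):
    """Call tagger.predict(), passing the keyword only when it is supported."""
    kwargs = {}
    if supports_kwarg(tagger.predict, "force_token_predictions"):
        kwargs["force_token_predictions"] = True
    return tagger.predict(sentences, **kwargs)


if __name__ == "__main__":
    print(safe_predict(OldTagger(), []))
    print(safe_predict(NewTagger(), []))
```

Requiring a minimum flair version, as suggested above, is the simpler option; a guard like this only matters if older versions must keep working.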
Second issue:
Once the above was solved, the inflections word swap did not return any transformations, only the original input. Tracing the problem, I noticed that WordSwapInflections._get_replacement_words() gets part of speech (POS) from flair and checks it against WordSwapInflections._enptb_to_universal, which converts POS tags from the Penn Treebank to universal POS tags. However, flair (v. 0.11) seems to be returning universal POS tags, so none of the words' POS tags were ever found in the list of those to transform, so no transformations were made. (See PyCharm screenshot below.)
As a quick solution, I (locally) changed line 57 to reference the dictionary's values (instead of keys), and line 68 to lemminflect_pos = word_part_of_speech directly. A better solution would probably be to replace self._enptb_to_universal with an array with the relevant universal POS tags, and reference that directly.
This screenshot is from mid-debugging a test that calls the inflection perturbation. You can see that word_part_of_speech is "NOUN", which is in the values, not keys, of self._enptb_to_universal. (And, in fact, changing the code to reference the dictionary values addressed the issues and made the test pass.)
To Reproduce
You can reproduce the first issue by running code or by checking the code directly.
So either:
flair: 0.6
~ OR ~
flair version 0.6.2, specifically flair/models/sequence_tagger_model.py lines 299-308, which do not include the parameter force_token_predictions.
flair version 0.11, specifically lines 427-437, which include the parameter force_token_predictions.
For the second issue, presuming flair is up to date:
Expected behavior
With the above code, I expect the result to be several transformations (up to the number requested) with inflectional perturbations.
Note that I was able to get this once I updated flair and changed the two lines of code referenced in the Describe the Bug section above.
System Information (please complete the following information):
Library versions: datasets==2.4.0; transformers==4.21.0; flair==0.6, upgraded to flair==0.11.3 as described.
Textattack version: 0.3.8
Additional context
Let me know if you want me to code this out and do a PR vs. whether someone else can do the fix easily (i.e. has the environment already set up)!
Also, this is what I was using as a test: