EmilHvitfeldt / textdata

Download, parse, store, and load text datasets instead of storing them in packages
https://emilhvitfeldt.github.io/textdata/

lexicon_nrc() broken due to structural changes in source ZIP archive #50

Closed grantdick closed 2 years ago

grantdick commented 2 years ago

lexicon_nrc() fails to run because a file it expects is missing.

The structure of the archive downloaded from http://saifmohammad.com/WebDocs/Lexicons/NRC-Emotion-Lexicon.zip appears to have changed (according to http://saifmohammad.com/WebPages/NRC-Emotion-Lexicon.htm, this file was updated in August 2022).

It seems that the path currently specified in process_nrc(), "NRC-Emotion-Lexicon/NRC-Emotion-Lexicon-v0.92/NRC-Emotion-Lexicon-Wordlevel-v0.92.txt", should actually be "NRC-Emotion-Lexicon/NRC-Emotion-Lexicon-Wordlevel-v0.92.txt".
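
As a rough illustration of why a hard-coded nested path is fragile, here is a minimal sketch that looks the file up by name anywhere under the extracted folder. It is not necessarily how textdata handles this: find_lexicon_file() is a hypothetical helper, and the temporary directory only imitates the new archive layout with a placeholder file instead of downloading the real lexicon.

```r
# Hypothetical helper, not part of textdata: find a file by its name
# anywhere under an extracted archive, regardless of folder nesting.
find_lexicon_file <- function(extract_dir, file_name) {
  all_files <- list.files(extract_dir, recursive = TRUE, full.names = TRUE)
  hits <- all_files[basename(all_files) == file_name]
  if (length(hits) == 0) {
    stop("Could not find '", file_name, "' under ", extract_dir, call. = FALSE)
  }
  hits[[1]]
}

# Imitate the new archive layout in a temporary directory.
# Assumption: the file now sits directly under "NRC-Emotion-Lexicon/".
demo_dir <- tempfile("nrc_demo_")
new_layout <- file.path(demo_dir, "NRC-Emotion-Lexicon")
dir.create(new_layout, recursive = TRUE)

target <- "NRC-Emotion-Lexicon-Wordlevel-v0.92.txt"
writeLines("placeholder", file.path(new_layout, target))

# The old hard-coded path has one extra folder level and no longer exists.
old_path <- file.path(new_layout, "NRC-Emotion-Lexicon-v0.92", target)
old_path_exists <- file.exists(old_path)

# The name-based lookup still finds the file.
found <- find_lexicon_file(demo_dir, target)
```

The trade-off is that a name-based search would silently pick the first match if the archive ever shipped two files with the same name, so a hard-coded path that is updated for each release may still be the simpler choice.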

EmilHvitfeldt commented 2 years ago

Thank you for reporting! This has been fixed in https://github.com/EmilHvitfeldt/textdata/commit/d41c432aa7d0f9c94262e12151e0589e94369ad9