cbaziotis / ekphrasis

Ekphrasis is a text processing tool geared towards text from social networks, such as Twitter or Facebook. It performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora (English Wikipedia and Twitter, 330 million English tweets).
MIT License

Word statistics not found.....How can I solve this error? #29

Closed: masterbo98 closed this issue 2 years ago

masterbo98 commented 3 years ago

I have uncompressed the zip you provided and put it in my home directory; the screenshot shows what my directory looks like. However, when I try to use SpellCorrector, it still reports that the word statistics were not found. (image)
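A quick way to narrow down this kind of error is to check whether the files the library is looking for are actually where it expects them. The sketch below is only a diagnostic aid, not part of ekphrasis: the default location (`~/.ekphrasis/stats`) and the file names (`counts_1grams.txt`, `counts_2grams.txt`) are assumptions that are not confirmed in this thread, and `missing_stats_files` and `default_stats_dir` are hypothetical helper names.

```python
from pathlib import Path

# Assumed layout (not confirmed in this thread): one folder per corpus,
# each holding a unigram and a bigram count file.
ASSUMED_FILES = ("counts_1grams.txt", "counts_2grams.txt")


def default_stats_dir():
    """Return the assumed default statistics folder: ~/.ekphrasis/stats."""
    return Path.home() / ".ekphrasis" / "stats"


def missing_stats_files(stats_dir, corpus):
    """Return the assumed statistics files that are absent for `corpus`."""
    corpus_dir = Path(stats_dir) / corpus
    return [
        corpus_dir / name
        for name in ASSUMED_FILES
        if not (corpus_dir / name).is_file()
    ]
```

Calling `missing_stats_files(default_stats_dir(), "twitter_2018")` returns an empty list when every assumed file is present; any paths it returns show where the unzipped data would need to go under these assumptions.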

fucaja commented 3 years ago

Install the library using:

!pip install git+https://github.com/fucaja/ekphrasis.git

UGUESS-lzx commented 2 years ago

Excuse me, where can I get the twitter_2018 file?😢

cbaziotis commented 2 years ago

Initially, I used my personal Dropbox account to host the file, as only some friends and I were using the library. It turns out that Dropbox has suspended my public links for generating excessive traffic.

I moved the data to another server and updated the public link for the stats.zip file. Please update the package and try again.

build from source

pip install git+https://github.com/cbaziotis/ekphrasis.git

or install from PyPI

pip install ekphrasis -U

FYI the link is https://data.statmt.org/cbaziotis/projects/ekphrasis/stats.zip

Let me know if it works now.
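If updating the package does not trigger a fresh download, the stats.zip link given above can also be fetched and unpacked by hand. The following is a minimal sketch under stated assumptions: `download_stats` and `extract_stats` are hypothetical helpers, not ekphrasis functions, and the internal layout of the archive is not described in this thread, so the function returns the top-level names for inspection instead of assuming them.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the comment above.
STATS_URL = "https://data.statmt.org/cbaziotis/projects/ekphrasis/stats.zip"


def download_stats(zip_path, url=STATS_URL):
    """Download the archive to `zip_path` (the file may be large)."""
    urllib.request.urlretrieve(url, zip_path)
    return Path(zip_path)


def extract_stats(zip_path, target_dir):
    """Unpack the archive into `target_dir` and return its top-level names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        names = archive.namelist()
    # Collapse every member path to its first component, e.g. "stats".
    return sorted({name.split("/")[0] for name in names})
```

For example, `download_stats("stats.zip")` followed by `extract_stats("stats.zip", Path.home() / ".ekphrasis")` would unpack the data under the home directory. The target folder here is an assumption; check the returned names against the path mentioned in the library's error message before relying on it.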