cbaziotis / ekphrasis

Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from two large corpora (English Wikipedia and Twitter, the latter with 330 million English tweets).
MIT License

Log messages print to stdout #2

Closed: ckingdev closed this 5 years ago

ckingdev commented 6 years ago

First, I just wanted to say thanks for making this available. I've been working with Reddit and Twitter text, and this has been instrumental.

This is altogether a pretty minor issue, but there are a few places where messages are printed to stdout. I use this in a script along with GNU parallel to process line-delimited JSON and merge the results back into one file, so the output ends up mixed with the messages about loading the models, etc.

I forked this and replaced the print statements with logging calls, so that they'll go to stderr and the user can control the verbosity by setting the logging level. Would you be interested in a pull request for that? I would just need to clean up a few changes I made to keep subreddit names together (just added a regex in the pipeline). I'd be happy to share that too.
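For illustration, here is a minimal sketch of that kind of change. It is not the code from the fork: the logger name `ekphrasis_sketch`, the `load_word_stats` function and the `configure_logging` helper are all hypothetical stand-ins, and the message text is made up. The point is only that a status message goes through `logging` (stderr by default) instead of `print`, so stdout stays clean for the actual output.

```python
import logging

# Hypothetical logger name, used only for this sketch (not ekphrasis's own).
logger = logging.getLogger("ekphrasis_sketch")


def load_word_stats(corpus):
    """Stand-in for a loader that used to announce itself with print()."""
    # Before: print("Reading " + corpus + " statistics ...")  -> went to stdout
    logger.info("Reading %s statistics ...", corpus)
    return {"corpus": corpus}


def configure_logging(level=logging.INFO, stream=None):
    """Attach one handler to the sketch logger and set its verbosity.

    StreamHandler(None) writes to sys.stderr, so by default nothing
    reaches stdout.
    """
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
    logger.handlers = [handler]
    logger.setLevel(level)
    logger.propagate = False
```

With this in place, a script that pipes results through GNU parallel could call `configure_logging(logging.WARNING)` to silence the loading messages entirely, or leave the level at `INFO` and still get only data on stdout.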

Thanks!

cbaziotis commented 6 years ago

Hi @ckingdev, sure, I would gladly accept a PR.