LuminosoInsight / sales-engineering-code

Code for sales engineering, particularly code that will be given to customers
MIT License

Twitter cleaner #7

Closed: lumitim closed 8 years ago

lumitim commented 9 years ago

Aaaaaand already messed up with git.

Tahnan commented 9 years ago

A general comment, because it's easier to leave it here than on any sort of line-by-line basis: it's a good idea to find a PEP8 checker and run files through it. (Disclosure: I'm still trying to get mine working as well as I want it to.)

In this case, doing so reveals trailing whitespace at the end of a number of lines and in otherwise blank lines. It also catches a few minor style points which are nevertheless kind of nice to fix, for the sake of keeping code consistent and readable: missing spaces around the operator in i + 1 on line 159, underindentation on line 110, overindentation on line 79. (In that last case, I'd recommend putting parentheses around the two-part "and" condition, which will make the indenting correct and will somewhat improve the readability, IMHO.)
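
To make the parenthesized-condition suggestion concrete, here is a minimal sketch. The function and variable names are hypothetical and are not taken from the code under review; only the layout of the `if` statement is the point.

```python
def should_keep(tweet_text, seen_texts, min_length=10):
    """Hypothetical filter; names are illustrative, not from the PR."""
    # Wrapping the two-part condition in parentheses lets it span two
    # lines with an ordinary continuation indent and no backslash,
    # which is the layout PEP 8 checkers accept.
    if (len(tweet_text) >= min_length
            and tweet_text not in seen_texts):
        return True
    return False
```

The extra indentation on the second clause is one of the layouts PEP 8 allows for a multi-line `if` condition; it keeps the condition visually distinct from the body.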

(flymake, an on-the-fly syntax checker, also reveals an unused import on line 3.)
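
For anyone without such a checker set up, the unused-import check can be roughly approximated with the standard library's `ast` module. This is a simplified sketch, not how flymake or any particular linter works: it ignores `__all__`, star imports, and names referenced only inside strings.

```python
import ast


def unused_imports(source):
    """Return sorted (line, name) pairs for imports never referenced."""
    tree = ast.parse(source)
    imported = {}  # bound name -> line number of the import
    used = set()   # every bare name referenced anywhere in the module
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a"
                name = alias.asname or alias.name.split(".")[0]
                imported[name] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported[alias.asname or alias.name] = node.lineno
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(
        (lineno, name)
        for name, lineno in imported.items()
        if name not in used
    )
```

Calling `unused_imports(open(path).read())` on a file (where `path` is whatever file you want to check) lists the candidates to review by hand.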

alin-luminoso commented 9 years ago

This whole operation would probably run orders of magnitude faster if it just used lumi_science to manually run the parts of the pipeline that it actually cares about (tokenization and collocation-finding). It might not even want to tokenize; I wouldn't be surprised if it gets better results using the SpaceSplittingReader or the like, since spam is probably susceptible to that.
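
As a rough illustration of the idea (whitespace splitting followed by finding repeated word pairs), here is a standard-library sketch. It does not use lumi_science or SpaceSplittingReader, whose APIs are not shown in this thread; `space_split` and `repeated_bigrams` are hypothetical stand-ins, and simple bigram counting is a simplification of real collocation-finding.

```python
from collections import Counter


def space_split(text):
    """Lowercase the text and split on whitespace only, keeping punctuation."""
    return text.lower().split()


def repeated_bigrams(texts, min_count=2):
    """Count adjacent word pairs across texts; keep pairs seen min_count+ times."""
    counts = Counter()
    for text in texts:
        tokens = space_split(text)
        # zip pairs each token with its successor, giving the bigrams
        counts.update(zip(tokens, tokens[1:]))
    return {pair: n for pair, n in counts.items() if n >= min_count}
```

Pairs that recur across many tweets are candidate spam phrases; the `min_count` threshold would need tuning against real data.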

Just leaving this as food for thought...

lumitim commented 8 years ago

Closing as obsolete due to code from Tim O that Dan already has elsewhere.