biocore / taxster

taxster: assigning taxonomy to organisms you've never even heard of
BSD 3-Clause "New" or "Revised" License
2 stars 4 forks source link

Pacakge bits #5

Open wasade opened 10 years ago

wasade commented 10 years ago

Merged in @audy's pass, and put some structure in place for the code. The api/cli distinction feels nice here, though I think there is more room to decompose... was using this as an opportunity to play with some ideas prior to dropping click into qiime

Note, travis and coveragerc are included but no unit tests have been written and haven't hooked up services yet

@audy, using the HashingVectorizer is awesome. I'm not seeing the memory spike you mentioned, but instead seeing a runtime issue that is driven by the serialization (profiling show'd stupid amounts of time being spent in ravel in _mul_multivector in scipy's compressed.py or something. The notebook was a few orders of magnitude faster by doing the kword transform external, but I agree with you that this is likely a bug.

Last, two things, the resulting classifications aren't great yet but we're near in a place to really start poking and proding and this is currently dependent on biocore/scikit-bio#643

gregcaporaso commented 9 years ago

@wasade, looks great. All very minor comments, feel free to ignore any for now.

wasade commented 9 years ago

Only slightly kind of totally missing unit tests...

@audy, did you hear back from scikit-learn about the performance issues by chance?

gregcaporaso commented 9 years ago

Ready for re-review?

wasade commented 9 years ago

Yup, unless we should get specific tests in at this time... On Sep 18, 2014 5:10 PM, "Greg Caporaso" notifications@github.com wrote:

Ready for re-review?

— Reply to this email directly or view it on GitHub https://github.com/biocore/taxster/pull/5#issuecomment-56103737.

gregcaporaso commented 9 years ago

What do you think - should we get the code in and then start the tests, or hold off until the tests are in?

On Thu, Sep 18, 2014 at 2:12 PM, Daniel McDonald notifications@github.com wrote:

Yup, unless we should get specific tests in at this time... On Sep 18, 2014 5:10 PM, "Greg Caporaso" notifications@github.com wrote:

Ready for re-review?

Reply to this email directly or view it on GitHub https://github.com/biocore/taxster/pull/5#issuecomment-56103737.

Reply to this email directly or view it on GitHub https://github.com/biocore/taxster/pull/5#issuecomment-56104016.

gregcaporaso commented 9 years ago

Made one other suggestion @wasade.