rapidsai / cuml

cuML - RAPIDS Machine Learning Library
https://docs.rapids.ai/api/cuml/stable/
Apache License 2.0

[FEA] Distributed CountVectorizer #3172

Open VibhuJawa opened 3 years ago

VibhuJawa commented 3 years ago

Is your feature request related to a problem? Please describe.

Now that we have a distributed hashing vectorizer, we should also have a distributed count vectorizer. This is especially useful when we want weights associated with each word, particularly when paired with our distributed tf-idf vectorizer.

Describe the solution you'd like

A distributed multi-GPU version of our count vectorizer.
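For illustration only, here is a minimal sketch of one way a distributed count vectorizer could work. It is not cuML's API or a proposed implementation. It assumes a two-pass scheme: each partition builds a local vocabulary, the vocabularies are merged into one global token-to-column mapping, and each partition then counts against that shared mapping. The function names (`tokenize`, `local_vocabulary`, `merge_vocabularies`, `count_partition`) are hypothetical, plain Python lists stand in for GPU partitions, and the whitespace tokenizer stands in for a real analyzer.

```python
from collections import Counter


def tokenize(doc):
    # Hypothetical stand-in for a real analyzer: lowercase, split on whitespace.
    return doc.lower().split()


def local_vocabulary(partition):
    # Pass 1 (runs per partition): collect the distinct tokens seen locally.
    vocab = set()
    for doc in partition:
        vocab.update(tokenize(doc))
    return vocab


def merge_vocabularies(local_vocabs):
    # Reduce step: union the per-partition vocabularies and assign column
    # indices in sorted order so every worker agrees on the same layout.
    merged = set().union(*local_vocabs)
    return {token: idx for idx, token in enumerate(sorted(merged))}


def count_partition(partition, vocabulary):
    # Pass 2 (runs per partition): one dense count row per document.
    rows = []
    for doc in partition:
        counts = Counter(tokenize(doc))
        row = [0] * len(vocabulary)
        for token, n in counts.items():
            row[vocabulary[token]] = n
        rows.append(row)
    return rows


# Two toy "partitions", standing in for data held on different GPUs.
partitions = [
    ["the cat sat", "the cat"],
    ["the dog sat on the cat"],
]

vocabulary = merge_vocabularies([local_vocabulary(p) for p in partitions])
matrix = [row for p in partitions for row in count_partition(p, vocabulary)]

print(vocabulary)
print(matrix)
```

Unlike a hashing vectorizer, this scheme needs a global synchronization step to agree on the vocabulary, which is the main extra cost of distributing a count vectorizer. The resulting counts keep an explicit word-to-column mapping, which is what allows per-word weights to be read back after a tf-idf transform.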

VibhuJawa commented 3 years ago

Happy to take this on.

github-actions[bot] commented 3 years ago

This issue has been marked stale due to no recent activity in the past 30d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be marked rotten if there is no activity in the next 60d.

github-actions[bot] commented 3 years ago

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.