deeplearning4j / deeplearning4j-docs

Documentation for Deeplearning4j - Deep Learning for the JVM, Java & Scala
Apache License 2.0

Text classification tutorial on Reuters dataset #37

Open crockpotveggies opened 6 years ago

crockpotveggies commented 6 years ago

Due Date

To be completed by: 2018-05-30

Description

Use the Reuters news dataset to create a tutorial for text classification.

Assignees

Please ensure you have assigned at least one person to this issue. Include any authors and reviewers required.

turambar commented 6 years ago

We should put some effort into finding a more relevant use case and data set.

For example, how about sentiment classification?

SNAP appears to have a bunch of cool data: http://snap.stanford.edu/data/

There are also a few cool text datasets on UCI: https://archive.ics.uci.edu/ml/datasets.html?format=&task=&att=&area=&numAtt=&numIns=&type=text&sort=nameUp&view=table

crockpotveggies commented 6 years ago

@turambar I trust your opinion. Which of these datasets/use cases do you recommend? Pick one and let's go with it.

turambar commented 6 years ago

Given the latest rumors about UiPath, I'd say we should find something as close as possible to the sort of text classification use cases we'll be working on for them.

RobAltena commented 6 years ago

I have written a Java iterator for the Reuters dataset: https://github.com/AltA-Advisory/ReutersParser

Hopefully it makes our lives a little easier.

turambar commented 6 years ago

Thanks @RobAltena: if we proceed with Reuters, we'll check it out!