dennybritz / cnn-text-classification-tf

Convolutional Neural Network for Text Classification in Tensorflow
Apache License 2.0
5.64k stars 2.77k forks

I want to use cross-validation to train and predict all my own data, how to use eval.py? #160

Closed jmfu95 closed 6 years ago

jmfu95 commented 6 years ago

My data has no separate test set. I want to use cross-validation to train on and predict all of my data. Should I train on my data and then use eval.py to predict? If not, how can I get a predicted label for every example? Does anyone know? Thanks.

Soonmok commented 6 years ago

You need to split your data into train_data and test_data. You can use the scikit-learn library to do this; the function you need is train_test_split.

Here is the documentation: http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
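
For reference, a minimal sketch of that split. The `texts` and `labels` lists are made-up stand-ins for your own data, and the 80/20 ratio and `random_state` value are only assumptions; how you feed the two halves into train.py and eval.py is left out here.

```python
from sklearn.model_selection import train_test_split

# Toy stand-ins for your own sentences and labels (hypothetical data).
texts = ["good movie", "bad movie", "great plot", "awful acting",
         "loved it", "hated it", "fine film", "poor script",
         "nice cast", "dull story"]
labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

# Hold out 20% of the examples; stratify keeps the class ratio in both parts.
x_train, x_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=42, stratify=labels)

print(len(x_train), len(x_test))
```

Train only on `x_train`/`y_train` and keep `x_test`/`y_test` aside for evaluation, so the test examples are never seen during training.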

Soonmok commented 6 years ago

I just created a pull request that uses cross-validation; you can check it out. (Note that it is for training, not for testing.)

jmfu95 commented 6 years ago

Thanks for your reply. You mean I should use this function to split the data into train and test sets and then use eval.py. But I train on the data in train.py, so if I use a model pre-trained on this dataset and then evaluate it on the same data, the accuracy may come out too high. Is that right?

kaiwendt92 commented 6 years ago

Which pre-trained model do you want to use? A model that is not based on your own data?

That is not a good idea, since you can't rely on the outcome.

Just take your data, use the scikit-learn function to split it into train and test sets (80/20 or 70/30, depending on how much data you have), train on the train set, and then use that model in eval.py on the test set.
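
If you really need a predicted label for every example, as in the original question, one option is k-fold cross-validation: each example is predicted by a model that was trained without it. Below is a rough sketch, assuming scikit-learn's `KFold`. The `majority_label_model` function is a hypothetical placeholder for "train with train.py, then predict with eval.py", and `out_of_fold_predictions` is an illustrative helper, not something in this repository.

```python
from collections import Counter

from sklearn.model_selection import KFold


def majority_label_model(train_texts, train_labels, test_texts):
    """Hypothetical placeholder model: always predicts the most common
    training label. Swap in your own train-then-predict step here."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return [most_common] * len(test_texts)


def out_of_fold_predictions(texts, labels, train_and_predict,
                            n_splits=5, seed=0):
    """Return one prediction per example, each made by a model that
    never saw that example during training."""
    predictions = [None] * len(texts)
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, test_idx in folds.split(texts):
        fold_preds = train_and_predict(
            [texts[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [texts[i] for i in test_idx])
        # Store each prediction at the position of its original example.
        for i, pred in zip(test_idx, fold_preds):
            predictions[i] = pred
    return predictions


# Toy data (hypothetical), just to show the call.
texts = [f"sentence {i}" for i in range(10)]
labels = [i % 2 for i in range(10)]
predictions = out_of_fold_predictions(texts, labels, majority_label_model)
print(predictions)
```

Because every fold trains a fresh model, this costs `n_splits` training runs, but the resulting accuracy is not inflated by evaluating on training data.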


jmfu95 commented 6 years ago

OK. I understand. Thank you all.