Closed jmfu95 closed 6 years ago
You need to split your data into train_data and test_data. You can use the scikit-learn library to do this; the function you need is train_test_split.
Here is the documentation: http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
I just created a pull request that uses cross-validation, which you can check. (Note that it is for training, not for testing.)
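As a rough sketch of the split described above: the sentences and labels below are made-up stand-ins for your own data, and `test_size=0.2`, `stratify`, and `random_state=42` are just example choices.

```python
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for your own sentences and labels.
texts = ["good movie", "bad movie", "great plot", "awful acting",
         "loved it", "hated it", "really fun", "very dull",
         "nice cast", "poor script"]
labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

# Hold out 20% of the examples for testing.
# stratify=labels keeps the class ratio similar in both parts;
# random_state makes the split reproducible.
x_train, x_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42
)

print(len(x_train), "training examples,", len(x_test), "test examples")
```

The training part would then be what the model is fitted on, and the test part is kept aside for evaluation only.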
Thanks for your reply. You mean I should use this function to split the data into train and test sets and then use eval.py? But I train on the data in train.py, so if I use a pre-trained model on this dataset and then evaluate on the same data, the accuracy may come out too high. Is that true?
Which pre-trained model do you want to use? A model that is not based on your own data?
This is not a good idea, since you can't rely on the outcome.
Just take your data. Use the function from scikit-learn to split it into train and test data (80/20 or 70/30, depending on how much you have), then train on the training part and use that model in eval.py.
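To illustrate why accuracy measured on the training data can't be relied on, here is a small sketch. It does not use the CNN from this repository: the random data and the 1-nearest-neighbour classifier are assumptions chosen only because that classifier memorises its training set.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))      # random features
y = rng.integers(0, 2, size=200)   # random labels, so there is nothing real to learn

# 80/20 split, as suggested above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# A 1-nearest-neighbour classifier simply memorises the training examples.
clf = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)

train_acc = clf.score(X_train, y_train)  # evaluated on data the model has seen
test_acc = clf.score(X_test, y_test)     # evaluated on held-out data

print(f"accuracy on training data: {train_acc:.2f}")
print(f"accuracy on held-out data: {test_acc:.2f}")
```

The score on the training data is perfect even though the labels are random, while the held-out score should sit much closer to chance. Only the held-out number says anything about how the model behaves on new data.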
From: jmfuuu notifications@github.com Sent: Wednesday, 25 July 2018 11:11 To: dennybritz/cnn-text-classification-tf Cc: Subscribed Subject: Re: [dennybritz/cnn-text-classification-tf] I want to use cross-validation to train and predict all my own data, how to use eval.py? (#160)
OK. I understand. Thank you all.
My data has no test dataset. I want to use cross-validation to train on and predict my data. Should I train on my data and then use eval.py to predict? If not, how can I get predicted labels for all of the data? Does anyone know? Thanks.
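One possible way to get a predicted label for every example is to collect out-of-fold predictions: split the data into k folds, train on k-1 of them, predict the remaining one, and repeat until each fold has been predicted once. The sketch below is only an illustration under assumptions: the toy data is made up, and the TF-IDF plus logistic regression pipeline is a stand-in for the CNN in this repository, where each fold would instead mean one training run on the training part followed by evaluation on the held-out part.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for your own sentences and labels.
texts = np.array(["good movie", "bad movie", "great plot", "awful acting",
                  "loved it", "hated it", "really fun", "very dull",
                  "nice cast", "poor script"])
labels = np.array([1, 0, 1, 0, 1, 0, 1, 0, 1, 0])

# One slot per example; -1 marks "not predicted yet".
predicted = np.full(len(texts), -1)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in skf.split(texts, labels):
    # Train a fresh model on the training folds only (stand-in classifier).
    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit(texts[train_idx], labels[train_idx])
    # Predict only the held-out fold, which the model has never seen.
    predicted[test_idx] = model.predict(texts[test_idx])

print("out-of-fold accuracy:", (predicted == labels).mean())
```

Because every example lands in exactly one held-out fold, `predicted` ends up with a label for all of the data, and none of those labels comes from a model that was trained on that example.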