seg / 2016-ml-contest

Machine learning contest - October 2016 TLE

The magic number 0.43 #8

Closed thanish closed 8 years ago

thanish commented 8 years ago

Which testing dataset was the prediction done on to reach an accuracy of 0.43? I mean, was the prediction made on the blind dataset from the SHANKLE well, the NEWBY well, or some other data?

kwinkunks commented 8 years ago

Please use the printed article as the intended view of performance, not the notebooks. So the blind well is SHANKLE and the F1 score is 0.43. I will try to make the notebooks consistent today.

Notwithstanding this, if we are able to replace the 'true' blind test with new data (see #2, which we are still waiting to resolve), then of course the performance of Brendon's model will change.

kwinkunks commented 8 years ago

Reopening because, while b9f2b15fcf4b56bed54e5b69246f431df497426b has resolved the mismatch between the article and the notebook, I cannot reproduce the F1 score in the article.

Right now, it's 0.39 (cf. 0.43 in the article). Hoping I just missed something when I merged the final version of the notebook. Maybe @brendonhall can check the notebook and make sure it's the same as his. I'm thinking the main differences could be the test/train random seed, and the values of the hyperparameters C and gamma.
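
For what it's worth, here's a rough sketch of why those two things matter for reproducing a score (the data and all the values below are placeholders, not the ones from the article):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in features/labels; in the notebook X and y come from the facies CSV.
X, y = make_classification(n_samples=500, n_features=7, n_informative=5,
                           n_classes=3, random_state=0)

# Any difference in random_state, test_size, C, or gamma changes the score.
X_train, X_cv, y_train, y_cv = train_test_split(X, y, test_size=0.2,
                                                random_state=42)
clf = SVC(C=10, gamma=1).fit(X_train, y_train)
print(clf.score(X_cv, y_cv))
```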

LukasMosser commented 8 years ago

See this closed issue by @CannedGeo: #10 Training_data.csv?

brendonhall commented 8 years ago

Hi everyone, sorry for being late to the party on this. The problem is that the main notebook and the article are a little out of sync. I streamlined the narrative of the notebook quite a bit for the article. One of the things I did was create a dataset for the article that only had a complete set of well logs. facies_vectors.csv is the original training set from the website I obtained the data from. I removed the vectors that don't have a PE value, and saved that dataset as training_data.csv. That is what I used for the article, and I have changed the notebook to match.
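
In case it helps anyone reproduce that step, it boils down to something like this (the column and file names are the ones described above; the exact filtering in the notebook may look slightly different):

```python
import pandas as pd

# Load the original training set and keep only rows that have a PE log value,
# then save the filtered set used for the article.
df = pd.read_csv('facies_vectors.csv')
df = df.dropna(subset=['PE'])
df.to_csv('training_data.csv', index=False)
```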

Second, in the article I used test_size=0.1 when splitting the training and cross-validation sets. The notebook had a value of 0.2, so I have changed this to match.

After these changes, I'm getting an overall F1 of 0.42 for the classifier. Not quite what is reported in the article. When I run the pure article code again I'm also getting 0.42. Perhaps there has been some change in the ordering/randomization of the data? Maybe some libraries have been updated? It's Halloween???
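
For anyone trying to reproduce the comparison, it amounts to something like the snippet below. The feature columns, random seed, C, gamma, and the F1 averaging are placeholders here, so don't treat it as the exact article code:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import f1_score

df = pd.read_csv('training_data.csv')
y = df['Facies'].values
X = df[['GR', 'ILD_log10', 'DeltaPHI', 'PHIND', 'PE']].values  # assumed feature set

# test_size=0.1 as in the article; the seed is a placeholder.
X_train, X_cv, y_train, y_cv = train_test_split(X, y, test_size=0.1,
                                                random_state=42)

clf = SVC(C=10, gamma=1).fit(X_train, y_train)
y_pred = clf.predict(X_cv)

# One way to collapse the per-facies scores into a single overall F1;
# the article may use a different averaging.
print(f1_score(y_cv, y_pred, average='micro'))
```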

kwinkunks commented 8 years ago

Awesome, thanks @brendonhall.

As far as I'm concerned, 0.42 is the same as 0.43. I am very confident that someone will get a better fit than 0.43 soon anyway, if they haven't already. So if this isn't a non-issue already, I think it will be soon.

@thanish Thank you again for raising this. I'll close it now. The notebook is a good reflection of the 'base case'. We can of course re-open if need be, I trust you will let me know. Cheers!