Open moonblue333 opened 5 years ago
@moonblue333 A test set accuracy of 98.77% on Fashion-MNIST with a network of a thousand or so parameters does not seem correct. Kindly double-check a few things:
One common mistake is that people think they are working with Fashion-MNIST when they are actually working with MNIST. This mistake appears repeatedly in #110, #47, #36, #119, #129, etc.
We have submitted the source (not including the core algorithm, but you can see that everything we tested is correct).
Unfortunately, your code doesn't contain any validation to ensure the data is Fashion-MNIST. By default, input_data.read_data_sets will download, or quietly fall back to, standard MNIST, so people can easily miss it, unless you do:
data = input_data.read_data_sets('data/fashion', source_url='http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/')
Or you download the data yourself and put it there in advance.
I'm sorry to say it, but this is highly likely due to a bug in your code. Please check your code again and again. I'm closing this issue as invalid. Please feel free to reopen it if you pass the following tests and show some evidence that you are working with Fashion-MNIST.
Here are some ways to check whether you are working with the correct dataset.
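One quick sanity check can be sketched as follows. This is only an illustration, not code from the repo: it reads a gzipped IDX image file and compares its mean pixel intensity with approximate reference values (about 0.13 for MNIST and about 0.29 for Fashion-MNIST training images). The reference numbers and the helper names `mean_intensity` and `guess_dataset` are assumptions made for this example.

```python
import gzip
import struct

# Approximate mean pixel intensities (scaled to [0, 1]) commonly quoted for the
# training images of each dataset. These are assumed reference values, not exact.
REFERENCE_MEANS = {"MNIST": 0.1307, "Fashion-MNIST": 0.2860}


def mean_intensity(idx_gz_path):
    """Return the mean pixel value in [0, 1] of a gzipped IDX image file."""
    with gzip.open(idx_gz_path, "rb") as f:
        # IDX image header: magic number, image count, rows, cols (big-endian).
        magic, count, rows, cols = struct.unpack(">IIII", f.read(16))
        if magic != 2051:  # 0x00000803 marks unsigned-byte image data
            raise ValueError("not an IDX image file: magic=%d" % magic)
        expected = count * rows * cols
        pixels = f.read(expected)
    if expected == 0 or len(pixels) != expected:
        raise ValueError("empty or truncated IDX image file")
    return sum(pixels) / (255.0 * len(pixels))


def guess_dataset(idx_gz_path):
    """Return (name of the closest reference dataset, measured mean intensity)."""
    mean = mean_intensity(idx_gz_path)
    name = min(REFERENCE_MEANS, key=lambda k: abs(REFERENCE_MEANS[k] - mean))
    return name, mean
```

Calling `guess_dataset` on the train-images file in your data directory should report "Fashion-MNIST"; if it reports "MNIST", the loader most likely fell back to the standard digits. Looking at a handful of images by eye remains the most reliable check.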
Our benchmark table (contributed by different researchers) also suggests that it's impossible for a simple CNN to get 99% test accuracy. A fully connected (FC) network should do even worse, due to its lack of spatial representation power.
Good luck!
I'm reopening this issue as @moonblue333 and @yuenuting argue that they are using the right dataset. I have also marked this issue as "help wanted" and welcome the community to validate their idea.
There is no point in being offended. Tolerating others' questioning is a part of science and should not be taken personally by any means. As the core maintainer of this repo, the least I can do is to challenge people's submissions and fence the wrong ones off from the benchmark table.
Again, @moonblue333 please do not think that questioning your result is meant to deliberately hurt you. There is no such drama or discrimination as you pictured.
I see you have open-sourced the code, good.
You don't need to explain it line by line to me. Please understand that your result outperforms all others in the benchmark table by a large margin. With such a breakthrough (you mentioned yourself in the repo that it's a world record), you have to wait for more people (not just me) to reproduce the result and validate your idea.
Therefore, I have already marked this issue as "help-wanted" and "contribution-welcome"; we just need to wait for the community to reproduce the result. After all, this is an open-source project that relies on the community. Don't worry, people will find it soon.
I've changed the issue title to draw more attention from the community.
To all, the repo that the issuer mentioned is https://github.com/yuenuting/incremental-learning-world-record-mnist-fashion
@hanxiao The code is not open-sourced as far as I can see... the only thing that is open is the model runner... Please close this issue.
@moonblue333 @yuenuting I do not have time, due to my own TODOs, to run code that does not exist as far as I can see. Kindly open a new pull request when you are ready to share your findings with the public, since there is no point in putting an untested number in a table, as you mention. All the other benchmarks are linked to verifiable code, which is what we will need from you as well. Also, please keep the discussion here to the issue at hand rather than writing essays justifying the numbers you claim, as you are not helping yourself.
Hi @moonblue333 @yuenuting, it looks like you are using ROC and AUC as the evaluation metrics.
Note that our benchmark table is based on mean accuracy. To make a fair comparison with other submissions, please change your get_performance to the following and report it again:
def get_performance(self, p, y):
    p = np.argmax(p, axis=1)
    y = np.argmax(y, axis=1)
    return np.mean(p == y)
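To make the metric concrete, here is a small standalone sketch of the same mean-accuracy computation on made-up data. The function name `mean_accuracy` and the toy arrays are assumptions for illustration only; `p` holds per-class scores and `y` holds one-hot labels, as the `get_performance` above appears to expect.

```python
import numpy as np


def mean_accuracy(p, y):
    """Fraction of samples whose highest-scoring class matches the one-hot label."""
    predicted = np.argmax(p, axis=1)  # predicted class index per sample
    actual = np.argmax(y, axis=1)     # true class index per sample
    return float(np.mean(predicted == actual))


# Toy example: 4 samples, 3 classes (values are invented for illustration).
p = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.3, 0.3, 0.4],
              [0.6, 0.3, 0.1]])
y = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 1],
              [0, 1, 0]])

acc = mean_accuracy(p, y)  # 3 of the 4 predictions match, so 0.75
print("mean accuracy:", acc)
```

Unlike AUC, which scores how well the model ranks samples for each class, this counts only whether the top-scoring class is the correct one, so the two numbers are not directly comparable.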
OK.
We tidied the mean_accuracy [MEAN_ACCURACY] version, and the final accuracy (10 classes) is 98.*%.
We are tidying and will submit it again ASAP.
def get_performance(self, p, y):
    print("$$$$$$$$$ get_performance()")
    print('p=',p.shape)
    print('y=',y.shape)
    print('len(p)=',len(p))
    ok = 0
    no = 0
    for i in range(len(p)):
        if np.argmax(p[i]) == np.argmax(y[i]):
            ok = ok + 1
        else:
            no = no + 1
        print('p0=',p[i])
        print('y0=',y[i])
        print('p0-c=',np.argmax(p[i]))
        print('y0-c=',np.argmax(y[i]))
        print('correct=',(np.argmax(p[i]) == np.argmax(y[i])))
        print()
    print('ok=',ok)
    print('no=',no)
    print('acc=',(ok/(ok+no)))
    p = np.argmax(p, axis=1)
    y = np.argmax(y, axis=1)
    return np.mean(p == y)
Submitted one version (95.41 ~ 97.0); others will be submitted ASAP.
Submitted another version (96.17 ~ 97.5); others will be submitted ASAP.
@kashif @hanxiao
New, better versions are being tidied and are on the way; we v them AAA-SSS-AAA-PPP.
2019-01-06: tidying completed, final_voted. (incremental + ensemble + net2net + ... & fully connected small network + a few big CNN networks)
2019-01-07: we submitted version 4. (Maybe this is the final version; if we submit a more powerful version, we will submit the whole AGI project, because copying and tidying code from a bigger project is time-consuming.)
@kashif @hanxiao we submitted v4, with detailed descriptions. (Some mini sub-networks use FC-NN, that is OK.)
Please refer to: https://github.com/yuenuting
incremental-learning-world-record-mnist-fashion-v1 (as title, and the portal of these projects)
incremental-learning-world-record-mnist-fashion-v2 (95.41 ~ 97.0)
incremental-learning-world-record-mnist-fashion-v3 (96.17 ~ 97.5)
incremental-learning-world-record-mnist-fashion-v4 (98.*)
incremental-learning-world-record-mnist-fashion-v (open or not, under discussion ...)
We deleted the discussion of ideas here, especially some sharp criticisms of today's AI, such as BP, CNN, the overall direction, etc. For the main ideas, please refer to our project README and check the code directly.
Ideas contributor: yuenuting & moonblue333
Algorithm developer: moonblue333