Problem with respect to running the code.

d12306 commented 5 years ago

Hi @haldai, sorry to bother you again. I have several questions.

1. Will running "test_go_M(10, Model, [200, 0.4, 0, 100], stroke)." generate the final dictionary? If so, how can I use the dictionary to encode the training and test images and then train an SVM? Could you please describe the entire pipeline for using the code?
2. The data MINST_train.csv seems to be corrupted.
3. If I want to use 100 MNIST images per class for training, should I directly call "test_go_M1(10, Model, [200, 0.4, 0, 100], stroke)."? Also, does the 200 in [200, 0.4, 0, 100] mean the dictionary has 200 dimensions? And what about the 100?

Thanks, I look forward to your reply.

Xuefeng Du from XJTU.

haldai commented 5 years ago

Hi, sorry for the mess of our code 😂

Will running "test_go_M(10, Model, [200, 0.4, 0, 100], stroke)." generate the final dictionary? If so, how can I use the dictionary to encode the training and test images and then train an SVM? Could you please describe the entire pipeline for using the code?

Documentation of the parameters will be added later, but I'm afraid I don't have enough time at the moment... The test_learn/9 predicate in test_ol.pl and test.pl describes the learning process. The learned dictionary is written to the result/ directory, from which it can be loaded and used to encode the data. The test_decode/4 predicate in test.pl shows an example of this kind of encoding (sorry for the confusing predicate name, though...), and decode_ol.pl shows how to call the predicate.

The data MINST_train.csv seems to be corrupted.

I just checked and it is fine. You may want to re-download it, or simply generate the CSV files from the original MNIST dataset.

If I want to use 100 MNIST images per class for training, should I directly call "test_go_M1(10, Model, [200, 0.4, 0, 100], stroke)."? Also, does the 200 in [200, 0.4, 0, 100] mean the dictionary has 200 dimensions? And what about the 100?

The first parameter, 200, is the dictionary size; the last one, 100, is the maximum number of sparse coding iterations. If you want to use 100 images per class for training, you should sample the MNIST data and construct your own mnist_train.csv. In fact, there is a MNIST_100.csv that was randomly sampled in this way.
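For anyone who wants to build such a sampled file themselves, here is a minimal sketch in Python. It is not part of LASIN, and it rests on assumptions about the CSV layout that you should check against your own file: one image per row, the class label in the first column (`label_col=0`), and no header row. The function names are made up for illustration.

```python
import csv
import random
from collections import defaultdict


def sample_per_class(rows, per_class=100, label_col=0, seed=0):
    """Pick up to `per_class` random rows for each label, keeping the original row order.

    Assumption: `rows` is a list of CSV rows and rows[i][label_col] is the class label.
    """
    # Group row indices by their label.
    by_label = defaultdict(list)
    for idx, row in enumerate(rows):
        by_label[row[label_col]].append(idx)

    # Draw the sample for each class with a fixed seed so the result is reproducible.
    rng = random.Random(seed)
    keep = set()
    for indices in by_label.values():
        keep.update(rng.sample(indices, min(per_class, len(indices))))

    return [row for idx, row in enumerate(rows) if idx in keep]


def sample_csv(src_path, dst_path, per_class=100, label_col=0, seed=0):
    """Read src_path, sample per class, write dst_path, and return the number of rows written."""
    with open(src_path, newline="") as src:
        rows = [row for row in csv.reader(src) if row]  # skip empty lines
    sampled = sample_per_class(rows, per_class, label_col, seed)
    with open(dst_path, "w", newline="") as dst:
        csv.writer(dst).writerows(sampled)
    return len(sampled)
```

Calling `sample_csv(source, target)` with your own paths writes at most 100 rows per class to `target`. If your file has a header row or stores the label in a different column, adjust the reader and `label_col` accordingly.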

If you need more instructions, please feel free to contact me or @KarlFreecss. He recently adapted LASIN to another dataset and optimised the abduction process, so he may have more experience with using this code.

d12306 commented 5 years ago

Thanks for the explanation. Say I want to use the code to train on the MNIST_100 dataset and evaluate on the test set. Could you tell me whether what I am doing is right?

@haldai, @KarlFreecss .

First, I run "test_go_M1(10, Model, [200, 0.4, 0, 100], stroke)." in test.pl and get the latest dictionary, dict_1.csv. (I actually got 10 dictionaries, so the "10" in the call is only used to name the output dictionaries and has no other use, right?)

Then I run "test_decode('/address/to/my/latest/dictionary/dict_1.csv', [200, 0.4, 0, 100], '../../data/MNIST_test.csv', '/address/to/my/output/file/MNIST_test_code.csv')."

Will that be OK?

Will the generated codes of the test images be in the same order as the images in the MNIST_test.csv file? I ask because I am going to use them to train the SVM.

haldai commented 5 years ago

First, I run "test_go_M1(10, Model, [200, 0.4, 0, 100], stroke)." in test.pl and get the latest dictionary, dict_1.csv. (I actually got 10 dictionaries, so the "10" in the call is only used to name the output dictionaries and has no other use, right?)

The 10 here is the maximum number of LASIN iterations; the 10 dictionaries are the outputs of the individual iterations.

Then I run "test_decode('/address/to/my/latest/dictionary/dict_1.csv', [200, 0.4, 0, 100], '../../data/MNIST_test.csv', '/address/to/my/output/file/MNIST_test_code.csv')."

Will that be OK?

Will the generated codes of the test images be in the same order as the images in the MNIST_test.csv file? I ask because I am going to use them to train the SVM.

This is okay; the order will be the same. You can plot them with MATLAB or any tool you like. :)
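Because the row order is preserved, the labels from the test CSV can be paired with the codes by row index before they go into an SVM. Below is a rough Python sketch of that pairing. It is not part of LASIN, and the file layouts are assumptions to verify first: the label is taken from the first column of the image CSV, the code file is assumed to hold one purely numeric row per image, and neither file is assumed to have a header. The function name is hypothetical.

```python
import csv


def load_codes_with_labels(image_csv, code_csv, label_col=0):
    """Pair the i-th label in image_csv with the i-th code row in code_csv.

    Assumptions: both files have one row per image, in the same order;
    image_csv stores the label in column `label_col`; code_csv is all numeric.
    """
    with open(image_csv, newline="") as f:
        labels = [row[label_col] for row in csv.reader(f) if row]
    with open(code_csv, newline="") as f:
        codes = [[float(v) for v in row] for row in csv.reader(f) if row]

    # A length mismatch means the two files cannot be aligned row by row.
    if len(labels) != len(codes):
        raise ValueError(
            "row count mismatch: %d labels vs %d codes" % (len(labels), len(codes))
        )
    return codes, labels
```

The returned `codes` and `labels` lists can then be handed to whichever SVM library you prefer as the feature matrix and target vector.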

d12306 commented 5 years ago

Thanks for your patience, @haldai. I found that dict_10.csv is always better than dict_1.csv, even though I assumed dict_1.csv had undergone more iterations. How can it be worse than a dictionary obtained before it?

haldai commented 5 years ago

dict_10.csv should be the dictionary output by the 10th iteration, so it is the one that has gone through the most iterations.
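If the number in the file name is the iteration index, the final dictionary is the one with the largest number. A small Python sketch for picking it out of a list of paths follows; it only assumes the dict_N.csv naming seen in this thread, and the helper names are made up. Note that it compares the numbers as integers, since a plain string sort would rank dict_9.csv above dict_10.csv.

```python
import re
from pathlib import Path

# Matches file names such as dict_1.csv or dict_10.csv and captures the number.
_DICT_NAME = re.compile(r"dict_(\d+)\.csv")


def iteration_of(path):
    """Return the number N from a dict_N.csv file name, or None if the name does not match."""
    match = _DICT_NAME.fullmatch(Path(path).name)
    return int(match.group(1)) if match else None


def latest_dictionary(paths):
    """Return the path whose dict_N.csv name has the largest N, or None if none match."""
    numbered = [(iteration_of(p), p) for p in paths]
    numbered = [(n, p) for n, p in numbered if n is not None]
    if not numbered:
        return None
    return max(numbered, key=lambda pair: pair[0])[1]
```

For example, passing the paths of all CSV files in the result/ directory returns the dict_N.csv with the highest N and ignores any other files.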

d12306 commented 5 years ago

Thank you so much for helping me with my project.

haldai commented 5 years ago

Not at all. Please feel free to let us know if you have any further questions 😃
