Closed zq1314 closed 6 years ago
@mbbrodie
Sorry for the delay, @zq1314 (out of town). I abandoned this repo nearly a year ago and shifted the topic of my dissertation. Due to the already ill-posed nature of GAN tasks, combining GANs with MCL did not produce promising results. If you're still interested in the area, however, I recommend looking into more recent MCL work, such as this 2018 CVPR paper: http://openaccess.thecvf.com/content_cvpr_2018/papers/Firman_DiverseNet_When_One_CVPR_2018_paper.pdf
One of the biggest hindrances to the wider adoption and acceptance of MCL is the need for multiple, expensive-to-train models. DiverseNet and some previous methods (such as TreeNets) provide more efficient approaches that allow for more stable training and require fewer computational resources. Sorry I can't provide a more detailed tutorial. If you're interested in using Caffe for your research, Stefan Lee has a CIFAR-10 MCL classification example here: https://github.com/steflee/MCL_Caffe/tree/master/examples/cifar10 (check out cifar10_quick_train_test.prototxt for the network definition).
Hope that helps and good luck.
@mbbrodie thank you for your help
Sorry to bother you, but their papers don't have public code. @mbbrodie, I feel stuck.
No worries. Looks like Stefan removed the CIFAR-10 example and replaced it with an MNIST LeNet example. Check out the files that begin with 'mcl_' in this directory: https://github.com/steflee/MCL_Caffe/tree/master/examples/mnist
MNIST is a nicer starting point anyway (smaller network, and it's easier to read through the files and see how the pieces work). If you're still struggling after that, it's not hard to reproduce the MCL layer in Python (run k models, identify the one with the smallest loss, and only update that network). But this MNIST example will hopefully work for you. Good luck.
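To make that Python suggestion concrete, here is a minimal sketch of the idea (the toy linear models and all names are illustrative, not from Stefan's code): run k models on a batch, identify the one with the smallest loss, and take a gradient step only on that winner.

```python
import numpy as np

rng = np.random.default_rng(0)
K, D, C, LR = 3, 4, 2, 0.5  # number of models, input dim, num classes, learning rate
weights = [rng.normal(size=(D, C)) for _ in range(K)]  # one toy linear model per head

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(probs, y):
    # multinomial logistic (cross-entropy) loss, averaged over the batch
    return -np.log(probs[np.arange(len(y)), y] + 1e-12).mean()

def mcl_step(x, y):
    """One MCL update: only the lowest-loss model receives a gradient step."""
    probs = [softmax(x @ W) for W in weights]
    losses = [nll(p, y) for p in probs]
    winner = int(np.argmin(losses))       # identify the model with the smallest loss
    grad_logits = probs[winner].copy()    # gradient of softmax cross-entropy
    grad_logits[np.arange(len(y)), y] -= 1.0
    grad_logits /= len(y)
    weights[winner] -= LR * (x.T @ grad_logits)  # update only that network
    return winner, losses[winner]

x = rng.normal(size=(8, D))
y = rng.integers(0, C, size=8)
for _ in range(20):
    winner, loss = mcl_step(x, y)
```

With real networks you'd replace the linear models with k copies of your architecture and the manual gradient with your framework's backward pass, but the control flow (argmin over losses, winner-only update) stays the same.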
When I look at the mcl_lenet_train_test.prototxt file, I see that it defines three identical models, but I think it should train three different models. @mbbrodie
The architecture in your diagram is correct. All layers prefixed with 1_ correspond to the first network, 2_ to the second, and 3_ to the third.
Because they have different names, Caffe treats them as separate layers/networks. At the end of mcl_lenet_train_test.prototxt, you'll see that 1_prob, 2_prob, and 3_prob feed into the MCLMultinomialLogisticLoss layer.
layer {
  name: "loss"
  type: "MCLMultinomialLogisticLoss"
  bottom: "1_prob"
  bottom: "2_prob"
  bottom: "3_prob"
  bottom: "label"
  top: "multiple-output loss"
  include { phase: TRAIN }
}
That loss layer contains the code for the sMCL algorithm. If you're interested in creating a new loss layer, you can imitate the code in that file.
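If it helps to see the computation outside of Caffe, here is a rough pure-Python sketch of what the forward and backward passes of such a loss layer compute. This is a simplification under assumed conventions (each bottom is an (N, C) probability array, and the winner is chosen per example), not a line-for-line port of Stefan's layer.

```python
import numpy as np

def mcl_loss_forward(prob_list, labels):
    """Forward: per example, keep only the smallest multinomial logistic
    loss among the K predictors (the winner-take-all assignment)."""
    N = len(labels)
    # per-example negative log-likelihood under each of the K bottoms
    nlls = np.stack([-np.log(p[np.arange(N), labels] + 1e-12) for p in prob_list])
    winners = nlls.argmin(axis=0)               # which predictor wins each example
    loss = nlls[winners, np.arange(N)].mean()   # average only the winning losses
    return loss, winners

def mcl_loss_backward(prob_list, labels, winners):
    """Backward: gradient w.r.t. each bottom, zero wherever that predictor lost."""
    N = len(labels)
    grads = []
    for k, p in enumerate(prob_list):
        g = np.zeros_like(p)
        mask = winners == k
        # d(-log p_y)/dp_y = -1/p_y, applied only to examples this predictor won
        g[mask, labels[mask]] = -1.0 / (p[mask, labels[mask]] + 1e-12) / N
        grads.append(g)
    return grads
```

The key property to preserve in any reimplementation is the backward pass: each predictor only receives gradient from the examples it won, which is what pushes the k networks toward diverse specializations.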
As a heads up, PyCaffe, PyTorch, and TensorFlow make it much easier to code, debug, and experiment with new MCL layers (you don't have to recompile or deal with header files). However, depending on your research goals, Stefan's code might be just fine.
Hi @mbbrodie, I have now successfully trained on my training set using the code at https://github.com/chhwang/cmcl. I want to watch the prediction accuracy during training in TensorBoard, but the program doesn't write training logs. I don't understand TensorFlow very well. Can you help me look at that code and tell me how to output training logs?
If you don't understand TensorFlow, read their online documentation. They have great, up-to-date tutorials for beginners, including some on TensorBoard. If you're still struggling, post a question on Stack Overflow or another forum site.
@mbbrodie