google / mentornet

Code for MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks
Apache License 2.0
320 stars, 63 forks

Question About .csv file #2

Open zevyu opened 5 years ago

zevyu commented 5 years ago

I want to know how the csv file used to train mentor-dd is generated. My understanding is that you train the baseline model on a clean-label dataset and then compute the loss on the corrupted-labels dataset to produce the csv file. Could you explain the details of generating the csv file?

roadjiang commented 5 years ago

We first train our model for 18 epochs on the noisy dataset. Then we use the model to evaluate on another small dataset, for which we have some clean labels. The model outputs all the features (on the small dataset) used to generate the csv.

zevyu commented 5 years ago

I got it, thanks.

ruirui88 commented 5 years ago

Hi, I want to make sure I understand how the csv file is generated. You said that the model is first pre-trained on the noisy dataset and then evaluated on the small dataset (is its size 10 percent?). So the clean labels in the csv file are the true labels of the clean data, while the noisy labels are the model's predictions? Is that right?

roadjiang commented 5 years ago

Details are in https://github.com/google/mentornet/blob/master/TRAINING.md

clean label column: ground-truth labels on the small clean dataset
noisy label column: given labels on the current noisy dataset
loss column: loss computed using the noisy label
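For readers trying to build such a file, here is a minimal sketch of how the three columns described above could be assembled. It is not the authors' code. The `id` column, the header names, the column order, and the helper names `softmax_cross_entropy`, `build_rows` and `write_csv` are all assumptions made for illustration, and the toy logits stand in for the outputs of a model pre-trained on the noisy dataset. Check TRAINING.md for the exact format the training script expects.

```python
import csv
import io
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one example's logits against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def build_rows(logits_batch, noisy_labels, clean_labels):
    """One row per example; the loss is computed with the NOISY label."""
    rows = []
    for i, (logits, noisy, clean) in enumerate(
            zip(logits_batch, noisy_labels, clean_labels)):
        rows.append({
            "id": i,                 # hypothetical sample id
            "clean_label": clean,    # ground truth on the small clean set
            "noisy_label": noisy,    # label as given in the noisy dataset
            "loss": round(softmax_cross_entropy(logits, noisy), 6),
        })
    return rows


def write_csv(rows, handle):
    """Write rows with an assumed header; adapt to the real expected format."""
    writer = csv.DictWriter(
        handle, fieldnames=["id", "clean_label", "noisy_label", "loss"])
    writer.writeheader()
    writer.writerows(rows)


# Toy stand-in for model outputs on two examples of the small dataset.
logits_batch = [[2.0, 0.5, 0.1], [0.2, 0.3, 3.0]]
noisy_labels = [0, 1]   # second example carries a corrupted label
clean_labels = [0, 2]

rows = build_rows(logits_batch, noisy_labels, clean_labels)
buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
```

In this toy run the example whose noisy label disagrees with its clean label ends up with the larger loss, which is the kind of signal the loss column is meant to carry.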

ruirui88 commented 5 years ago

Sorry, I don't quite get it. Is the pre-trained model evaluated on the clean and noisy datasets together? That is, do the samples whose ground-truth label and noisy label are the same come from the clean dataset, while the others come from the noisy dataset? Also, how is the value in the clean label column obtained for the noisy dataset? Is it manually annotated, or is it the prediction of the pre-trained model?

wffzxyl commented 5 years ago

Could you upload the files or code for the function 'provide_resnet_noisy_data', which is used to extract ResNet features in cifa_eval.py (line 186)?

wffzxyl commented 5 years ago

Have you finished generating the csv files? Could you share the csv generation code? I can't find it in these files.

roadjiang commented 5 years ago

The authors are supposed to generate their own csv files from their models.

AnnPe commented 4 years ago

Hi @ruirui88, did you manage to create your csv file?

AnnPe commented 4 years ago

Hi @wffzxyl, did you manage to generate the csv file? I cannot reproduce the authors' results, so I'm afraid I'm doing it all the wrong way round.