Closed tikboaHIT closed 3 years ago
Hi,
what version of the BRACS dataset are you using?
I assume 566 samples means the new version, which we haven't used for this work. Are you able to reproduce with the previous version?
I ran the pre-trained model on the previous_version data with bracs_hact_7_classes_pna.yml,
but got similarly incorrect results.
The results of the model using hact are as follows:
I don't know which phase went wrong.
Maybe something to do with the preprocessing. You can download the preprocessed cell, tissue and hact graphs for the BRACS dataset here:
https://ibm.box.com/s/6v4sasavltjzi91gmohswz2ek6i8i3lp
Or download this zip file that includes the test cell graphs:
https://ibm.box.com/shared/static/412lfz992djt8u6bgu13y9cj9qsurwui.zip
Let me know if you can reproduce with these.
I downloaded the preprocessed cell, tissue and hact graphs, and it works. Thank you. I will check my preprocessing files later. The results of the model using hact are as follows:
By the way, I see you divided the BRACS dataset into four test folds. How did you divide them, and why? Are they partitioned randomly?
I can't get correct results with the cell graph files in the second link.
.../python3.7/site-packages/sklearn/metrics/_classification.py:1248: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. Use zero_division parameter to control this behavior.
But I can get the correct cggnn results using pre-processed files you provided in the first link:
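The UndefinedMetricWarning above fires when some class label never appears in `y_true`, so that class's recall has a zero denominator. A minimal sklearn illustration (toy labels, not BRACS data):

```python
from sklearn.metrics import recall_score

# Class 2 is predicted once but never occurs in y_true, so its recall is 0/0.
y_true = [0, 0, 1, 1]
y_pred = [0, 2, 1, 1]

# zero_division=0 silences the warning and pins the undefined recall to 0.0
per_class = recall_score(y_true, y_pred, labels=[0, 1, 2],
                         average=None, zero_division=0)
# per-class recall: 0.5 for class 0, 1.0 for class 1, 0.0 for class 2
```

In this thread the warning usually means the test split being evaluated is missing some of the seven BRACS classes, which is a hint that the wrong set of graph files was loaded.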
When pre-processing images, some trigger a warning like this while others don't: I checked the original images and they're not empty pictures like #2 mentioned. Will this disturb the pre-processing? Thank you for your help! :)
The warnings should not be an issue for running the preprocessing.
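If those expected warnings clutter the preprocessing logs, the stdlib `warnings` module can capture them without hiding the step's result; a generic sketch (`noisy_step` is a stand-in, not a histocartography function):

```python
import warnings

def noisy_step():
    # Stand-in for a preprocessing step that emits a benign warning
    warnings.warn("feature map nearly empty", RuntimeWarning)
    return 42

# Record warnings instead of printing them, so batch logs stay readable
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = noisy_step()
```

The step still returns its value; the captured warnings can be inspected or counted afterwards.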
Hello, may I ask where can I download pre-trained models?
@guillaumejaume Would you know why a model trained on graphs preprocessed and created locally doesn't provide the same performance as the graphs you provide in the download link?
I trained my own HACTNet, cell, and tissue graph models to see whether it's the cell or the tissue graphs that are causing the performance gap. Based on my results, it appears that it's mainly the tissue graphs generated by this repo under stock settings that aren't as good for model performance as those uploaded by the histocartography team. Maybe there's something about ColorMergedSuperpixelExtractor or RAGGraphBuilder that's different between the paper code and what I ran? I used histocartography version 0.2.0 for this.
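For context on what RAGGraphBuilder does conceptually: a region adjacency graph links superpixels that share a boundary. A toy pure-numpy illustration on a 3×3 label map (not the histocartography implementation, just the idea):

```python
import numpy as np

def region_adjacency(labels):
    """Collect pairs of distinct region labels that touch horizontally or vertically."""
    edges = set()
    # compare each pixel with its right neighbour, then its bottom neighbour
    for a, b in ((labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :])):
        diff = a != b
        for u, v in zip(a[diff].ravel(), b[diff].ravel()):
            edges.add((int(min(u, v)), int(max(u, v))))
    return sorted(edges)

# Three superpixels, all pairwise adjacent
seg = np.array([[0, 0, 1],
                [0, 2, 1],
                [2, 2, 1]])
print(region_adjacency(seg))  # [(0, 1), (0, 2), (1, 2)]
```

Small differences in how superpixels are sized or merged upstream (e.g. ColorMergedSuperpixelExtractor settings) change which regions touch, and therefore the tissue graph topology, which could plausibly account for the gap.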
Here are my weighted F1 scores on the test set for models trained on the IBM Box graph sets, compared to models trained on graphs I created locally from the BRACS ROI previous version using generate_hact_graphs.py, and the pretrained checkpoint scores provided in the README:
Model | README | Trained on uploaded | Trained on locally generated |
---|---|---|---|
CG Model | 56.7 | 57.5 | 56.7 |
TG Model | 57.8 | 57.1 | 53.8 |
HACTNet | 61.5 | 61.4 | 56.5 |
EDIT: Noticed that my script failed to create the last two dozen training graphs because of a corrupted RoI download. I finished creating my graphs, retrained the models, and updated my findings.
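Reading the gaps off the table: CG drops only 0.8 points between uploaded and locally generated graphs, while TG drops 3.3 and HACTNet (which consumes the tissue graphs) drops 4.9, which is why the tissue graphs look like the main culprit. A quick check of that arithmetic:

```python
# Weighted F1 from the table above: (trained on uploaded, trained on local)
scores = {"CG": (57.5, 56.7), "TG": (57.1, 53.8), "HACT": (61.4, 56.5)}

# Gap attributable to locally generated graphs, per model
gaps = {model: round(uploaded - local, 1)
        for model, (uploaded, local) in scores.items()}
print(gaps)  # {'CG': 0.8, 'TG': 3.3, 'HACT': 4.9}
```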
Hi, I'm not able to reproduce the "Trained on uploaded" results; my results remain around 55. Did you do hyperparameter tuning, or simply follow the settings (learning rate, epochs, batch size) provided in the README?
I didn't do any hyperparameter tuning; I just used the config and settings as shown. I've noticed that test set accuracy tends to fluctuate up and down a few percentage points even with the same settings, so I think 55 is close enough.
Hello,
When using your pre-trained model, I found that I cannot reach your reported accuracy and F1 score. The results of the model using hact are as follows:
The results of the model using cggnn are as follows:
The results of these experiments are quite different from yours. I'm confused and don't know which phase went wrong.
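For anyone comparing numbers in this thread: the reported scores are weighted F1, which averages per-class F1 weighted by each class's support. A toy sklearn example (labels are illustrative, not BRACS predictions):

```python
from sklearn.metrics import f1_score

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

# Per-class F1 (0.5, 0.8, 2/3) weighted by each class's number of true samples
wf1 = f1_score(y_true, y_pred, average="weighted")
print(round(wf1, 4))  # 0.6556
```

A mismatch between weighted F1 and plain accuracy or macro F1 is a common source of "can't reproduce the README numbers" confusion.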