Nellaker-group / happy

MIT License

model id 2 not updated to match the gdrive file #14

Open brainfo opened 4 days ago

brainfo commented 4 days ago

Hi, I was testing on the sample wsi.tiff with the example from the third section of your README. It turns out that the base db in the current repo has cell_model id 2 set to cell_model_accuracy_0.8204.pt, but the gdrive only contains cell_model_accuracy_0.8472.pt. Should the accuracy 0.8472 model be added to the database, or should it replace the id 2 model? Which one was used in the paper?

Best,

brainfo commented 4 days ago

Also, in graphinference.py, the hdf5 file it tries to locate is $'projects/placenta/results/embeddings/lab{run_id}/{samplename}/run{run_id}.hdf5'. However, the output I got by copying the cell_inference code from your README is at "projects/placenta/results/embeddings/lab_3/slide_sample_wsi.tif/run_4.hdf5".

I understand this might be because I didn't give the sample wsi an id so that it would be wsi_{runid}, but the lab id is 3 and the run id is 4. Is this something that can be fixed in the code?
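To make the mismatch concrete, here is a minimal sketch of a path builder that keeps the lab id, slide name, and run id as three separate values. The helper `embeddings_path` and the constant `EMBEDDINGS_ROOT` are hypothetical names used for illustration only. I have not checked how the repository actually builds this path, so treat this as one possible layout that reproduces the path I observed, not as the repo's implementation.

```python
from pathlib import PurePosixPath

# Hypothetical constant: the directory prefix taken from the paths quoted above.
EMBEDDINGS_ROOT = PurePosixPath("projects/placenta/results/embeddings")


def embeddings_path(lab_id: int, slide_name: str, run_id: int) -> PurePosixPath:
    """Build the hdf5 path from three independent values.

    Hypothetical helper, not part of the repository. The lab id and run id
    are passed separately so one cannot silently stand in for the other.
    """
    return EMBEDDINGS_ROOT / f"lab_{lab_id}" / slide_name / f"run_{run_id}.hdf5"


# Values taken from the output path reported in this comment.
observed = embeddings_path(lab_id=3, slide_name="slide_sample_wsi.tif", run_id=4)
print(observed)
```

With lab_id=3 and run_id=4 this gives the path my cell inference run actually wrote, which is why a template that reuses a single id for both placeholders cannot find it.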

brainfo commented 4 days ago

Also, for people like me who are not familiar with QuPath, a script to load the annotation file into QuPath would be valuable.

brainfo commented 4 days ago

And for the nuclei and cell type prediction (annotation) steps, do we also get result files?

Is analysis/visualisation/vis_nuclei_preds.py only for the training dataset?

I can't tell where the output files from the nuclei and cell prediction steps are written (whereas the hdf5 file is easy to find).

I see, OK:

            f.create_dataset("predictions", (total_cells,), dtype="int8")
            f.create_dataset("embeddings", (total_cells, 64), dtype="float32")
            f.create_dataset("confidence", (total_cells,), dtype="float16")
            f.create_dataset("coords", (total_cells, 2), dtype="uint32")

Can you provide a mapping from the prediction integers to the cell type names?

Is it in this order?

  1. Syncytiotrophoblast
  2. Cytotrophoblast
  3. Vascular Endothelial
  4. Vascular Myocyte
  5. Myocyte
  6. Leukocyte
  7. Hofbauer Cell
  8. Syncytial Knot
  9. Extra Villus Trophoblast
  10. Maternal Decidua
  11. Mesenchymal Cell
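If the order above is right, a lookup could be sketched as follows. Both the order and the zero-based indexing (list item 1 maps to prediction 0) are assumptions on my side that still need confirming, and `CELL_TYPES`, `LABELS`, and `count_cell_types` are hypothetical names, not identifiers from the repository.

```python
from collections import Counter

# ASSUMPTION: order copied from the list above, not confirmed by the maintainers.
CELL_TYPES = [
    "Syncytiotrophoblast",
    "Cytotrophoblast",
    "Vascular Endothelial",
    "Vascular Myocyte",
    "Myocyte",
    "Leukocyte",
    "Hofbauer Cell",
    "Syncytial Knot",
    "Extra Villus Trophoblast",
    "Maternal Decidua",
    "Mesenchymal Cell",
]

# ASSUMPTION: predictions are zero-based, so list item 1 corresponds to 0.
LABELS = dict(enumerate(CELL_TYPES))


def count_cell_types(predictions):
    """Count how often each cell type name occurs in a sequence of integer predictions."""
    return dict(Counter(LABELS[int(p)] for p in predictions))


# Toy predictions for illustration only, not real model output.
print(count_cell_types([0, 0, 1, 6, 10]))
```

The same function should work on the "predictions" dataset once it is read into an array, since each element is converted with `int()` before the lookup.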

And just to confirm: is your workflow to predict the tissue type, then subset the embeddings by tissue type and count the cell types within each tissue type?

What I don't understand is this: if the tissue type output is only the annotation tsv with columns X\tY\t\tTissue type (dot data), how should I subset the embeddings or coordinates? The rows are point-wise rather than ranges of pixels, so it is unclear how they link back to the embeddings or coordinates in the nuclei prediction hdf5.

Currently the nuclei prediction hdf5 has 20377 rows (X, Y), while the tissue prediction has 18318 rows.
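One way I could imagine linking the two outputs is an exact join on (x, y), sketched below. This assumes the tissue tsv rows use the same coordinates as the "coords" dataset, which I have not been able to confirm. The function name `cell_counts_per_tissue`, the tissue names, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def cell_counts_per_tissue(cell_coords, cell_preds, tissue_rows):
    """Join cells to tissue labels on exact (x, y) and count cell types per tissue.

    cell_coords: iterable of (x, y) pairs, e.g. rows of the "coords" dataset.
    cell_preds:  iterable of cell type predictions, aligned with cell_coords.
    tissue_rows: iterable of (x, y, tissue) tuples parsed from the tsv.
    Returns (counts, unmatched), where unmatched is the number of cells
    that have no tissue row at the same coordinates.
    """
    # ASSUMPTION: both outputs share one coordinate system, so exact matching works.
    tissue_at = {(int(x), int(y)): tissue for x, y, tissue in tissue_rows}
    counts = defaultdict(Counter)
    unmatched = 0
    for (x, y), pred in zip(cell_coords, cell_preds):
        tissue = tissue_at.get((int(x), int(y)))
        if tissue is None:
            unmatched += 1
            continue
        counts[tissue][int(pred)] += 1
    return {tissue: dict(c) for tissue, c in counts.items()}, unmatched


# Toy data for illustration only: four cells, three tissue rows.
coords = [(10, 20), (30, 40), (50, 60), (70, 80)]
preds = [0, 1, 0, 6]
tissues = [(10, 20, "tissue_a"), (30, 40, "tissue_a"), (50, 60, "tissue_b")]
counts, unmatched = cell_counts_per_tissue(coords, preds, tissues)
print(counts, unmatched)
```

If this is roughly the intended approach, the difference between 20377 and 18318 rows would show up as unmatched cells, and it would help to know whether those cells are dropped on purpose.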