DarioS closed this issue 1 year ago
This is covered in the SingleR book. http://bioconductor.org/books/release/SingleRBook/using-multiple-references.html
This issue also applies to any reference-based method that relies on author annotations, which is made quite clear throughout the book.
If you really want to get into the weeds, pick any mouse brain atlas and compare it to any other similar brain atlas. They'll label certain populations completely differently from each other, even if they pull out similar markers.
They will not align perfectly.
On Sat, Mar 11, 2023, 5:00 PM Dario Strbenac wrote:
I think there should be more emphasis on labelling stability in the vignette. For example, I have been investigating a publication by Regeneron Pharmaceuticals titled "Immunostimulatory Cancer-Associated Fibroblast Subpopulations Can Predict Immunotherapy Response in Head and Neck Cancer" https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9161438/ in Clinical Cancer Research. In the main text, only BLUEPRINT is mentioned.
But looking at the RDS file https://data.mendeley.com/datasets/yk8wj7xgdg/1 (the one named hnscc.gene.expression.integrated.rds), they also tried HPCA. Almost all of the main labels change between the two attempts. So it is not clear whether the biological conclusions are valid or why BLUEPRINT was chosen over HPCA (this was swept under the rug, and reviewers probably didn't even try downloading the RDS file to reproduce any of the results in the journal article). [image: image] https://user-images.githubusercontent.com/631218/224514147-46391114-4bad-45a6-a2bf-bfd46c268dff.png
The Annotation Diagnostics section of the vignette uses just one reference. It would be nice to see the concordance (or lack of it) demonstrated between different atlases that contain mostly the same cell types. It might at least help to make more peer reviewers aware of the issue.
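For anyone wanting to quantify that concordance, here is a minimal sketch of the idea in Python. It assumes you have already exported one label per cell from each reference and harmonized the label names to a shared vocabulary (references rarely use identical names). The function name `concordance` and the toy labels are hypothetical, not taken from the paper's RDS file; in R, base `table()` on the two label vectors would give the same cross-tabulation.

```python
from collections import Counter


def concordance(labels_a, labels_b):
    """Cross-tabulate two per-cell label vectors.

    Returns (table, agreement), where table counts each (label_a, label_b)
    pair and agreement is the fraction of cells given the same label.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both label vectors must cover the same cells")
    if not labels_a:
        raise ValueError("label vectors must not be empty")
    table = Counter(zip(labels_a, labels_b))
    same = sum(n for (a, b), n in table.items() if a == b)
    return table, same / len(labels_a)


# Hypothetical toy labels for six cells, already mapped to shared names.
ref_one = ["Fibroblasts", "Fibroblasts", "T cells", "Monocytes", "Fibroblasts", "B cells"]
ref_two = ["Fibroblasts", "Smooth muscle", "T cells", "Monocytes", "Chondrocytes", "B cells"]

table, agreement = concordance(ref_one, ref_two)

# Print the contingency table, one (label, label) pair per line.
for (a, b), n in sorted(table.items()):
    print(f"{a:12s} vs {b:14s} {n}")
print(f"agreement: {agreement:.2f}")
```

The off-diagonal pairs are the interesting part: they show which populations one reference splits or renames relative to the other, which a single agreement number hides.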
Doh! Thanks. I'll read the section titled Comparing Scores Across References.