If you are filing this issue based on a specific GitHub Discussion, please link to the relevant Discussion.
292
Step 1 to address #563
Describe the goals of the changes to the analysis module.
Here we will identify a method that we are comfortable using to annotate tumor cells in all samples in SCPCP000015. Then we will write a script that runs that method on a single SCE object and outputs annotations.
Based on findings in #532 and #558, I think we have three options:
Use AUCell with the pre-defined threshold calculated for SCPCL000822 and the marker gene list as the gene set.
Use SingleR with the combination of the SCPCL000822 reference and BlueprintEncodeData reference.
Use both of these methods and find the consensus between them. Any cells labeled as tumor in one and not the other would be labeled as ambiguous.
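To make the third option concrete, here is a minimal Python sketch of the consensus rule. It is only an illustration, not the module's implementation (which would be an R script operating on an SCE object). The function names, the `normal` label for non-tumor cells, the example cell names, the `>=` comparison, and the 0.05 cutoff are all assumptions made for this example.

```python
from typing import Dict

TUMOR = "tumor"
NORMAL = "normal"        # assumed label for non-tumor cells
AMBIGUOUS = "ambiguous"


def classify_by_auc(auc_scores: Dict[str, float], threshold: float) -> Dict[str, str]:
    """Call a cell tumor when its AUC score meets or exceeds a fixed cutoff."""
    return {
        cell: TUMOR if auc >= threshold else NORMAL
        for cell, auc in auc_scores.items()
    }


def consensus_labels(
    aucell_calls: Dict[str, str], singler_calls: Dict[str, str]
) -> Dict[str, str]:
    """Combine two sets of per-cell calls.

    Tumor in both methods -> tumor; tumor in exactly one -> ambiguous;
    tumor in neither -> normal. Only cells present in both inputs are kept.
    """
    consensus = {}
    for cell in sorted(aucell_calls.keys() & singler_calls.keys()):
        tumor_votes = [aucell_calls[cell] == TUMOR, singler_calls[cell] == TUMOR]
        if all(tumor_votes):
            consensus[cell] = TUMOR
        elif any(tumor_votes):
            consensus[cell] = AMBIGUOUS
        else:
            consensus[cell] = NORMAL
    return consensus


# Hypothetical inputs: AUC scores per cell and SingleR-style calls.
example_auc = {"cell_a": 0.12, "cell_b": 0.02, "cell_c": 0.30}
example_singler = {"cell_a": TUMOR, "cell_b": NORMAL, "cell_c": NORMAL}

example_aucell = classify_by_auc(example_auc, threshold=0.05)  # made-up cutoff
example_consensus = consensus_labels(example_aucell, example_singler)
```

With these made-up inputs, `cell_c` passes the AUC cutoff but is not called tumor by the second method, so it ends up as ambiguous.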
I am personally leaning towards using AUCell, because it relies on a pre-defined set of tumor marker genes that we expect to be expressed in all tumor cells. SingleR uses tumor cells from one sample to define tumor cells in the rest of the samples, which is probably fine for the most part, but given the heterogeneity of Ewing sarcoma, I am nervous we might miss some tumor cells. That said, with AUCell we would still be using a threshold from a single sample to define the AUC cutoff, so both methods are somewhat biased.
What will your pull request contain?
A script to run the identified method on a single SCE object. This will ultimately be part of a workflow that runs the method on all samples in SCPCP000015.
Will you require additional software beyond what is already in the analysis module?
No
Will you require different computational resources beyond what the analysis module already uses?
No
If known, when do you expect to file the pull request?
No response
One additional thought I have here is that we can start by just using AUCell, run it on all samples, and then evaluate whether we need to make any changes. Then we can decide whether we also need to use SingleR.