mpi2 / impc_stats_pipeline

R packages for the stats pipeline
https://www.mousephenotype.org/
Apache License 2.0

Check if HEM statistical analysis is being done by using only male WT #38

Open ficolo opened 8 months ago

marinak-ebi commented 8 months ago

The general question is: how was the genotype P-value generated?

The charts show boxplots for four groups: female HEM (empty), female WT, male HEM, and male WT. But there are no female hemizygote individuals. The raw data confirm that there are no female hemizygotes, which is expected because they are not possible by definition:

https://www.ebi.ac.uk/mi/impc/solr/experiment/select?q=*:*&fq=zygosity:%22hemizygote%22
>>> numFound: 1005229
https://www.ebi.ac.uk/mi/impc/solr/experiment/select?q=*:*&fq=zygosity:%22hemizygote%22&fq=sex:%22male%22
>>> numFound: 1005229
https://www.ebi.ac.uk/mi/impc/solr/experiment/select?q=*:*&fq=zygosity:%22hemizygote%22&fq=sex:%22female%22
>>> numFound: 0
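
The three counts above could be reproduced programmatically. The following is a minimal sketch, not part of the pipeline: the helper names (`count_url`, `num_found`, `fetch_count`) are hypothetical, and it assumes the endpoint accepts the standard Solr parameters `rows=0` and `wt=json` and returns the usual `response.numFound` field.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the URLs quoted above.
SOLR_SELECT = "https://www.ebi.ac.uk/mi/impc/solr/experiment/select"


def count_url(zygosity, sex=None):
    """Build a count-only query URL (hypothetical helper)."""
    params = [("q", "*:*"), ("fq", f'zygosity:"{zygosity}"')]
    if sex is not None:
        params.append(("fq", f'sex:"{sex}"'))
    # Assumption: standard Solr parameters; rows=0 skips the documents.
    params += [("rows", "0"), ("wt", "json")]
    return SOLR_SELECT + "?" + urlencode(params)


def num_found(response_text):
    """Extract numFound from a Solr JSON response body."""
    return json.loads(response_text)["response"]["numFound"]


def fetch_count(zygosity, sex=None):
    """Run the query over the network and return the hit count."""
    with urlopen(count_url(zygosity, sex)) as resp:
        return num_found(resp.read().decode("utf-8"))
```

Calling `fetch_count("hemizygote", "female")` would then be expected to return the same count as the third query above, provided the assumptions about the response format hold.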

How was this analysis performed? Did it include only male WT as controls, or both male WT and female WT?

For example, go to the Mid2 gene page → Phenotypes → All data → Lymphocyte differential count → Supporting data.

This is not the only case. We need to understand how the pipeline processes data when only one sex is present: does it use controls of both sexes or only one of them? Preferably this should be resolved before the next release in April.
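
To make the two possibilities concrete, here is a small sketch of the two control-selection strategies in question. It does not describe what the pipeline actually does; the record layout (`group`, `sex`) and the function names are assumptions made only for illustration.

```python
def mutant_sexes(records):
    """Return the set of sexes present among the mutant animals."""
    return {r["sex"] for r in records if r["group"] == "experimental"}


def select_controls(records, same_sex_only):
    """Pick WT controls, optionally restricted to the mutants' sexes."""
    controls = [r for r in records if r["group"] == "control"]
    if not same_sex_only:
        return controls  # option A: controls of both sexes
    sexes = mutant_sexes(records)
    return [r for r in controls if r["sex"] in sexes]  # option B


# Hypothetical toy data: male-only hemizygotes with WT of both sexes.
records = [
    {"id": "m1", "group": "experimental", "sex": "male"},
    {"id": "m2", "group": "experimental", "sex": "male"},
    {"id": "c1", "group": "control", "sex": "male"},
    {"id": "c2", "group": "control", "sex": "female"},
]

both_sex_controls = select_controls(records, same_sex_only=False)
male_only_controls = select_controls(records, same_sex_only=True)
```

The genotype P-value can differ between the two options, since the control group (and any sex term in the model) changes, which is why the choice needs to be confirmed.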