Open wangshun1121 opened 5 years ago
Here are the 50 genes from the PAR loci, taken from your gene table:
id | gene | chrom | chromStart | chromEnd | strand |
---|---|---|---|---|---|
ENSGR0000228572.6 | LL0YNC03-29C1.1 | chrY | 253743 | 255091 | + |
ENSGR0000182378.12 | PLCXD1 | chrY | 276322 | 303356 | + |
ENSGR0000178605.12 | GTPBP6 | chrY | 304529 | 318819 | - |
ENSGR0000226179.5 | LINC00685 | chrY | 320990 | 321851 | + |
ENSGR0000167393.16 | PPP2R3B | chrY | 333963 | 386955 | - |
ENSGR0000281849.2 | RP13-465B17.4 | chrY | 386980 | 405579 | + |
ENSGR0000275287.4 | Metazoa_SRP | chrY | 388100 | 388389 | - |
ENSGR0000280767.2 | RP13-465B17.5 | chrY | 419157 | 421980 | + |
ENSGR0000234958.5 | FABP5P13 | chrY | 523775 | 524102 | - |
ENSGR0000229232.5 | KRT18P53 | chrY | 545236 | 545352 | - |
ENSGR0000185960.12 | SHOX | chrY | 624344 | 659411 | + |
ENSGR0000237531.5 | RP11-309M23.1 | chrY | 990221 | 994365 | + |
ENSGR0000225661.6 | RPL14P5 | chrY | 1008503 | 1010101 | - |
ENSGR0000205755.10 | CRLF2 | chrY | 1187549 | 1212750 | - |
ENSGR0000198223.14 | CSF2RA | chrY | 1268800 | 1310381 | + |
ENSGR0000264510.5 | BX649553.3 | chrY | 1291755 | 1291828 | + |
ENSGR0000264819.5 | BX649553.4 | chrY | 1292094 | 1292167 | + |
ENSGR0000263980.5 | BX649553.2 | chrY | 1293615 | 1293689 | + |
ENSGR0000265658.5 | MIR3690 | chrY | 1293918 | 1293992 | + |
ENSGR0000263835.5 | BX649553.1 | chrY | 1294132 | 1294206 | + |
ENSGR0000223274.5 | RNA5SP498 | chrY | 1300256 | 1300375 | - |
ENSGR0000185291.10 | IL3RA | chrY | 1336616 | 1382689 | + |
ENSGR0000169100.12 | SLC25A6 | chrY | 1386152 | 1392724 | - |
ENSGR0000236871.6 | LINC00106 | chrY | 1396427 | 1399402 | + |
ENSGR0000236017.7 | ASMTL-AS1 | chrY | 1401769 | 1414028 | + |
ENSGR0000169093.14 | ASMTL | chrY | 1403139 | 1453762 | - |
ENSGR0000182162.9 | P2RY8 | chrY | 1462572 | 1537107 | - |
ENSGR0000197976.10 | AKAP17A | chrY | 1591593 | 1602514 | + |
ENSGR0000196433.11 | ASMT | chrY | 1615001 | 1643081 | + |
ENSGR0000223511.5 | RP13-297E16.4 | chrY | 1732584 | 1755985 | + |
ENSGR0000234622.5 | RP13-297E16.5 | chrY | 1767347 | 1768776 | + |
ENSGR0000169084.12 | DHRSX | chrY | 2219516 | 2502805 | - |
ENSGR0000223571.5 | DHRSX-IT1 | chrY | 2334295 | 2336410 | - |
ENSGR0000214717.9 | ZBED1 | chrY | 2486414 | 2500967 | - |
ENSGR0000277120.4 | MIR6089 | chrY | 2609191 | 2609254 | + |
ENSGR0000223773.6 | CD99P1 | chrY | 2609348 | 2657229 | + |
ENSGR0000230542.5 | LINC00102 | chrY | 2612988 | 2615347 | - |
ENSGR0000002586.17 | CD99 | chrY | 2691179 | 2741309 | + |
ENSGR0000168939.10 | SPRY3 | chrY | 56954332 | 56968979 | + |
ENSGR0000237801.5 | AMD1P2 | chrY | 57015105 | 57016096 | - |
ENSGR0000237040.5 | DPH3P2 | chrY | 57062156 | 57062405 | + |
ENSGR0000124333.14 | VAMP7 | chrY | 57067813 | 57130289 | + |
ENSGR0000228410.5 | TCEB1P24 | chrY | 57165512 | 57165845 | - |
ENSGR0000223484.6 | TRPC6P | chrY | 57171890 | 57172769 | - |
ENSGR0000124334.16 | IL9R | chrY | 57184101 | 57197337 | + |
ENSGR0000270726.5 | AJ271736.10 | chrY | 57190738 | 57208756 | + |
ENSGR0000185203.11 | WASIR1 | chrY | 57201143 | 57203357 | - |
ENSGR0000182484.14 | WASH6P | chrY | 57207346 | 57212230 | + |
ENSGR0000276543.4 | AJ271736.1 | chrY | 57209151 | 57209218 | + |
ENSGR0000227159.7 | DDX11L16 | chrY | 57212184 | 57214397 | - |
My suggestion: These genes should be paired with their homologous genes on chrX. The expression value of each chrY gene should then be added to that of its chrX homolog, and the chrY genes removed from the gene expression meta table.
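A minimal sketch of this suggestion, assuming linear-scale expression values (for example TPM or expected counts, not log-transformed values) held in a plain dict keyed by gene ID. It also assumes that each PAR gene's chrX counterpart carries the same ID with the `ENSGR` prefix replaced by `ENSG0`, which is how the IDs in the table above appear to be built. `par_y_to_x_id` and `fold_par_into_x` are hypothetical helper names, not part of the workflow.

```python
def par_y_to_x_id(gene_id):
    """Map a chrY PAR gene ID (ENSGR...) to its assumed chrX ID (ENSG0...)."""
    if not gene_id.startswith("ENSGR"):
        raise ValueError(f"not a PAR chrY gene ID: {gene_id}")
    # Assumption: the chrX ID differs only in the fifth character (R -> 0).
    return "ENSG0" + gene_id[len("ENSGR"):]


def fold_par_into_x(expression):
    """Return a new dict with chrY PAR values added to their chrX homologs.

    `expression` maps gene ID -> linear-scale expression value.
    """
    merged = {}
    for gene_id, value in expression.items():
        # Redirect ENSGR IDs to the chrX homolog; leave other IDs unchanged.
        if gene_id.startswith("ENSGR"):
            target = par_y_to_x_id(gene_id)
        else:
            target = gene_id
        merged[target] = merged.get(target, 0.0) + value
    return merged


# Toy values for illustration only, not real expression data.
toy = {
    "ENSG00000185960.12": 3.0,  # SHOX, chrX copy
    "ENSGR0000185960.12": 1.5,  # SHOX, chrY PAR copy
    "ENSG00000000003.14": 10.0,  # some non-PAR gene
}
merged = fold_par_into_x(toy)
```

If the chrX ID is missing from the input, the chrY value is simply carried over under the chrX ID. Values stored on a log scale would need to be converted back to linear scale before summing.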
Dear @wangshun1121 ,
Thank you immensely for this write-up and the corresponding table. Because the workflow must produce deterministic expression results that match the values on Xena, I have to freeze the default workflow inputs. However, I've wanted to start hosting new sets of inputs as new GENCODE annotations are released, and the issues you've brought up are important points to consider.
I'll keep this issue open until I've properly addressed it. Please let me know if you have any other suggestions or improvements I can make; they're greatly appreciated.
Hello Dr Vivian:
I developed a rough pipeline following your instructions, then checked the RSEM gene expression results on Xena:
https://xenabrowser.net/datapages/?dataset=tcga_RSEM_gene_fpkm&host=https%3A%2F%2Ftoil.xenahubs.net&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443
Your dataset has 50 more gene IDs than mine. Comparing the two ID lists, I found that all 50 extra gene IDs begin with ENSGR. What's more, these genes are all from the PAR loci on the Y chromosome.
Your results indicate that you didn't remove these PAR loci in the RSEM pipeline as you did in Kallisto. In my opinion, genes from the PAR loci should also be removed in the RSEM pipeline.
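The comparison described above can be sketched as a set difference followed by a prefix check. This is only an illustration on toy ID lists; `extra_ids`, `all_par_y`, and `drop_par_y` are hypothetical names, and the lists below stand in for the gene ID columns of the two result tables.

```python
def extra_ids(theirs, mine):
    """Gene IDs present in `theirs` but absent from `mine`, sorted."""
    return sorted(set(theirs) - set(mine))


def all_par_y(gene_ids):
    """True if every ID carries the ENSGR prefix used for chrY PAR genes."""
    return all(g.startswith("ENSGR") for g in gene_ids)


def drop_par_y(gene_ids):
    """Keep only IDs that do not carry the ENSGR prefix, preserving order."""
    return [g for g in gene_ids if not g.startswith("ENSGR")]


# Toy ID lists for illustration only.
xena_ids = [
    "ENSG00000185960.12",
    "ENSGR0000185960.12",
    "ENSG00000002586.17",
    "ENSGR0000002586.17",
]
my_ids = ["ENSG00000185960.12", "ENSG00000002586.17"]

extra = extra_ids(xena_ids, my_ids)
only_par = all_par_y(extra)
filtered = drop_par_y(xena_ids)
```

Note that `all_par_y` returns `True` for an empty list, so it is worth checking that `extra` is non-empty before drawing conclusions from it.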