One gene in the expression file has two entries with different Entrez IDs:
  hugo    entrez TCGA-OR-A5J1-01A-11R-A29S-07
1 SLC35E2 728661                    3293.2000
2 SLC35E2   9906                      35.3314
If another TCGA file uses SLC35E2 without an associated Entrez ID, use 9906, which will map to SLC35E2A in the future.
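The rule above can be sketched as a small lookup helper. This is a minimal sketch, not the project's actual code: `resolve_entrez`, `SYMBOL_OVERRIDES`, and the `expression_map` argument are hypothetical names, and the symbol-to-Entrez mapping from the expression file is assumed to be already loaded into a dict.

```python
from typing import Optional

# Symbols that appear more than once in the expression file, pinned to the
# Entrez ID chosen in the text above. (Hypothetical constant name.)
SYMBOL_OVERRIDES = {"SLC35E2": 9906}


def resolve_entrez(hugo: str, entrez: Optional[int],
                   expression_map: dict) -> Optional[int]:
    """Return the Entrez ID to use for one record of a TCGA file."""
    if entrez is not None:
        # The file already supplies an Entrez ID, so keep it.
        return entrez
    if hugo in SYMBOL_OVERRIDES:
        # Ambiguous symbol with no ID in the file: use the pinned ID.
        return SYMBOL_OVERRIDES[hugo]
    # Fall back to the expression-file mapping; None if the symbol is unknown.
    return expression_map.get(hugo)
```

A record that already carries 728661 keeps it; only records with no Entrez ID fall back to 9906.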
For driver mutations, these 14 genes do not appear in the expression file:
   entrez hgnc
    <int> <chr>
 1  84962 AJUBA
 2 139285 AMER1
 3   2909 ARHGAP35
 4   3125 HLA-DRB3
 5   3126 HLA-DRB4
 6 284058 KANSL1
 7   3803 KIR2DL2
 8   4297 KMT2A
 9   9757 KMT2B
10  58508 KMT2C
11   8085 KMT2D
12  57466 SCAF4
13   6427 SRSF2
14   7114 TMSB4X
For these 14 genes, use the Entrez IDs that Shane found (listed above). This may result in an Entrez ID with more than one mutation, which is fine as long as the mutation codes differ. If two mutations share both the Entrez ID and the mutation code, we cannot include both; drop the one that corresponds to one of the genes above.
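The collision rule above could look roughly like this. It is a sketch under assumptions: each mutation is taken to be a `(hugo, entrez, mutation_code)` tuple, and `drop_colliding_driver_mutations` and `ADDED_GENES` are hypothetical names; the real driver mutation table may be shaped differently. Note that this sketch drops every colliding row that belongs to one of the 14 genes, so if two such rows collide with each other, both are removed.

```python
from collections import Counter

# HUGO symbols of the 14 genes whose Entrez IDs were added by hand.
ADDED_GENES = {
    "AJUBA", "AMER1", "ARHGAP35", "HLA-DRB3", "HLA-DRB4", "KANSL1", "KIR2DL2",
    "KMT2A", "KMT2B", "KMT2C", "KMT2D", "SCAF4", "SRSF2", "TMSB4X",
}


def drop_colliding_driver_mutations(mutations):
    """Drop added-gene mutations whose (entrez, mutation_code) pair is shared.

    `mutations` is a list of (hugo, entrez, mutation_code) tuples.
    """
    # Count how many rows share each (Entrez ID, mutation code) pair.
    counts = Counter((entrez, code) for _, entrez, code in mutations)
    kept = []
    for hugo, entrez, code in mutations:
        collides = counts[(entrez, code)] > 1
        if collides and hugo in ADDED_GENES:
            # Same Entrez ID and same mutation code as another row:
            # the row from the hand-mapped gene is the one that goes.
            continue
        kept.append((hugo, entrez, code))
    return kept
```

Rows that share an Entrez ID but carry different mutation codes pass through unchanged.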
The database will take its Entrez-to-HUGO mapping from this file:
https://www.synapse.org/#!Synapse:syn21788372
All other data sources that make use of genes should be mapped to Entrez IDs with a mapping appropriate to the project they came from.
All non-driver-mutation TCGA files should use the mapping provided in the TCGA expression file:
https://www.synapse.org/#!Synapse:syn4976369