Closed jingchunzhu closed 8 years ago
I just tested the latest master branch code, and I see duplicates of the same record on the same line. I have seen this before in our old code, and I am not sure why it happens. In our old code, I just did a quick check per sample and only reported the unique records.
Click on download and check the results.
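For reference, the per-sample uniqueness check described above could look roughly like the sketch below. The row fields (`sample`, `chrom`, `start`, `end`, `ref`, `alt`) and the function name `uniqueRecordsPerSample` are assumptions made for illustration, not the project's actual schema or code.

```javascript
// Minimal sketch: keep only the first occurrence of each record per sample.
// The row shape {sample, chrom, start, end, ref, alt} is hypothetical.
function uniqueRecordsPerSample(rows) {
  const seen = new Set();
  return rows.filter(row => {
    // Build a key from the sample ID plus the fields that identify a record.
    const key = JSON.stringify([
      row.sample, row.chrom, row.start, row.end, row.ref, row.alt
    ]);
    if (seen.has(key)) {
      return false; // same record already reported for this sample
    }
    seen.add(key);
    return true;
  });
}
```

Because the sample ID is part of the key, identical mutations in different samples are kept; only repeats within one sample are dropped.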
and create a new one to adjust the "result" field to whatever the field label name is, e.g. TP53, or whatever the custom name would be? This would be for scenarios where the samples do not have mutations or don't have any data at all.
Don't worry about the above. Using "results" is good enough.
I just tested using the parameters you show above (e.g. TCGA Breast Cancer et al.) and experienced the same problem, with many samples. I checked the source data, props.req.rows, which contains all samples with mutations, and it does in fact contain the duplicate records. Looping in Brian to get his take on this situation: @acthp
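A check like this one can be sketched as a small helper that counts identical records in the row array. The field names (`sample`, `chrom`, `start`, `end`, `ref`, `alt`) and the helper name `findDuplicateRows` are hypothetical; the real shape of `props.req.rows` may differ.

```javascript
// Minimal sketch: list records that occur more than once in an array of rows.
// The row shape {sample, chrom, start, end, ref, alt} is an assumption.
function findDuplicateRows(rows) {
  const counts = new Map();
  for (const row of rows) {
    const key = JSON.stringify([
      row.sample, row.chrom, row.start, row.end, row.ref, row.alt
    ]);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Keep only keys seen more than once, with their occurrence count.
  return [...counts]
    .filter(([, count]) => count > 1)
    .map(([key, count]) => ({record: JSON.parse(key), count}));
}
```

Calling `findDuplicateRows(rows)` on the source rows returns one entry per repeated record, which makes it easy to tell whether the duplicates come from the data rather than from the download code.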
Let me check the input datafile itself to see if there are any duplicates first.
I figured it out. The input data actually contain duplicates. They are not an artifact: multiple measurements were made on the same tumor samples, and the duplicated records represent those measurements. In this case, one was done as the initial discovery and the other as a validation experiment.
Let's keep it the way it is now.
@jingchunzhu Since my pull request has been merged, can we close this issue and create a new one to adjust the "result" field to whatever the field label name is, e.g. TP53, or whatever the custom name would be? This would be for scenarios where the samples do not have mutations or don't have any data at all.