Closed jburos closed 7 years ago
I understand the source of the problem. Could you please email me the offending seq2HLA
file? If I remember correctly, that tick mark had some special significance to seq2HLA
that I'll dig into.
Thanks @rleonid! I will email you the output from seq2HLA for this sample. Agreed, I do think there is some significance to these.
I think the latest version parses this correctly; I wonder what version you ended up using.
$ ./hlarp seq2HLA test/mnt/nfs-pool-rcc/biokepi/results/bms/RCC_v01_with-rna_Sample_11_8-normal-CA209009_11_8_N-tumor-CA209009_11_8_T_SCR-rna-CA209009_11_8_T_C2D8-b37/48c05b974b6d025ad8ef06bb9470bef9rna-CA209009_11_8_T_C2D8edsl-concatseq2hla-workdir/
class,allele,qualifier,confidence,run
1,A*02:01',,0.001507,rna-CA209009_11_8_T_C2D8
1,A*29:02,,0.000001,rna-CA209009_11_8_T_C2D8
1,B*44:02,,0.048383,rna-CA209009_11_8_T_C2D8
1,B*44:02',,0.000000,rna-CA209009_11_8_T_C2D8
1,C*05:01,,0.038153,rna-CA209009_11_8_T_C2D8
1,C*16:01,,0.067506,rna-CA209009_11_8_T_C2D8
2,DQA1*02:01,,,rna-CA209009_11_8_T_C2D8
2,DQA1*02:01,,0.000000,rna-CA209009_11_8_T_C2D8
2,DQB1*02:02',,0.000000,rna-CA209009_11_8_T_C2D8
2,DQB1*03:02',,0.000407,rna-CA209009_11_8_T_C2D8
2,DRB1*07:01,,0.000000,rna-CA209009_11_8_T_C2D8
2,DRB1*14:103,,0.279392,rna-CA209009_11_8_T_C2D8
Though it is true that this apostrophe is an ambiguity indicator in seq2HLA
output (see https://bitbucket.org/hammerlab/seq2hla/src/edc2b613de435e88bf6fc688324613ad50ee7453/seq2HLA.py?at=default&fileviewer=file-view-default#seq2HLA.py-227).
Looking through my notes, I can't remember the exact (statistical) meaning of ambiguous in this case.
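For anyone post-processing these calls outside hlarp, here is a minimal sketch of how the trailing apostrophe could be separated from the allele name before further parsing. It is written in Python purely for illustration and is not how hlarp itself handles it; the function name `split_ambiguity` is hypothetical, and it assumes the marker only ever appears as the last character of the allele string, as in the output above.

```python
from typing import Tuple


def split_ambiguity(allele: str) -> Tuple[str, bool]:
    """Split a seq2HLA-style allele call into (allele, ambiguous).

    Assumption: ambiguity is signalled only by a single trailing
    apostrophe, e.g. "A*02:01'". Anything else is returned unchanged.
    """
    allele = allele.strip()
    if allele.endswith("'"):
        # Drop the marker and remember that the call was flagged.
        return allele[:-1], True
    return allele, False


if __name__ == "__main__":
    # Allele strings taken from the output shown earlier in this thread.
    for call in ["A*02:01'", "A*29:02", "DQB1*03:02'"]:
        name, ambiguous = split_ambiguity(call)
        print(name, "ambiguous" if ambiguous else "unambiguous")
```

Keeping the flag as a separate value, rather than discarding it, preserves whatever seq2HLA meant by it until the statistical meaning is pinned down.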
I am getting the following error when reading seq2HLA output:
When I look at the seq2HLA output, I see a bad character (') in the *-ClassI.HLAgenotype4digits file. This may or may not be the cause of the error, but it does happen to be the 31st character in this file.
This character appears in a number of our seq2HLA outputs from epidisco. Not sure if this is an epidisco issue or a now-standard output from seq2HLA that should be accommodated in hlarp.
Let me know if I can provide more info.