hammerlab / hlarp

Normalize HLA typing output.
Apache License 2.0

Parse seq2 hla ambiguity #26

rleonid closed 7 years ago

rleonid commented 7 years ago

This fixes #25.

To be clear, the latest version of hlarp doesn't crash with this error; it just propagates seq2HLA's ambiguity apostrophe at the end of the allele name into the output. This PR adds a new field, typer_spec, to the Info.t record to carry "unparsed" (untyped), parser-specific information into the normalized output.

For example, the new output looks like:

/hlarp_cli.native seq2HLA /path/concatseq2hla-workdir/
class,allele,qualifier,confidence,typer specific,sample
1,A*02:01,,0.001507,ambiguity=true,rna-CA209009_11_8_T_C2D8
1,A*29:02,,0.000001,ambiguity=false,rna-CA209009_11_8_T_C2D8
1,B*44:02,,0.000000,ambiguity=true,rna-CA209009_11_8_T_C2D8
1,B*44:02,,0.048383,ambiguity=false,rna-CA209009_11_8_T_C2D8
1,C*05:01,,0.038153,ambiguity=false,rna-CA209009_11_8_T_C2D8
1,C*16:01,,0.067506,ambiguity=false,rna-CA209009_11_8_T_C2D8
2,DQA1*02:01,,,ambiguity=false,rna-CA209009_11_8_T_C2D8
2,DQA1*02:01,,0.000000,ambiguity=false,rna-CA209009_11_8_T_C2D8
2,DQB1*02:02,,0.000000,ambiguity=true,rna-CA209009_11_8_T_C2D8
2,DQB1*03:02,,0.000407,ambiguity=true,rna-CA209009_11_8_T_C2D8
2,DRB1*07:01,,0.000000,ambiguity=false,rna-CA209009_11_8_T_C2D8
2,DRB1*14:103,,0.279392,ambiguity=false,rna-CA209009_11_8_T_C2D8
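
As a rough illustration of the idea (not the code in this PR), the sketch below strips a trailing apostrophe from a seq2HLA allele name and records it in a typer_spec string. The `info` record, its field names other than typer_spec, and the helpers `split_ambiguity`, `of_seq2hla`, and `to_csv_row` are all hypothetical stand-ins; the real Info.t may be shaped differently.

```ocaml
(* Hypothetical, simplified stand-in for the Info.t record in hlarp.
   Only the typer_spec field name comes from the PR description. *)
type info = {
  hla_class : int;
  allele : string;
  qualifier : string;
  confidence : float option;
  typer_spec : string; (* unparsed, typer-specific key=value text *)
}

(* Split a trailing apostrophe (the seq2HLA ambiguity marker) off an
   allele name. Returns the cleaned name and whether the marker was there. *)
let split_ambiguity name =
  let n = String.length name in
  if n > 0 && name.[n - 1] = '\'' then (String.sub name 0 (n - 1), true)
  else (name, false)

(* Build a record from a raw seq2HLA allele string, moving the ambiguity
   marker out of the allele name and into typer_spec. *)
let of_seq2hla ~hla_class ~confidence raw_allele =
  let allele, ambiguous = split_ambiguity raw_allele in
  { hla_class;
    allele;
    qualifier = "";
    confidence;
    typer_spec = Printf.sprintf "ambiguity=%b" ambiguous }

(* Render one CSV row in the column order shown in the output above:
   class, allele, qualifier, confidence, typer specific, sample. *)
let to_csv_row ~sample i =
  let conf =
    match i.confidence with
    | None -> ""
    | Some c -> Printf.sprintf "%f" c
  in
  Printf.sprintf "%d,%s,%s,%s,%s,%s"
    i.hla_class i.allele i.qualifier conf i.typer_spec sample
```

Under these assumptions, `of_seq2hla ~hla_class:1 ~confidence:(Some 0.001507) "A*02:01'" |> to_csv_row ~sample:"s1"` yields a row whose allele column is the clean `A*02:01` and whose typer-specific column is `ambiguity=true`. Keeping the flag as free text in typer_spec means other typers can pass along their own extras without changing the shared columns.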

rleonid commented 7 years ago

@ihodes or @jburos, can you take a look when you have a chance? This is not a high priority.

jburos commented 7 years ago

@rleonid in terms of functionality, looks great to me. thanks.