TheJacksonLaboratory / LIRICAL

LIkelihood Ratio Interpretation of Clinical AbnormaLities
https://thejacksonlaboratory.github.io/LIRICAL/stable

Extraneous tab in TSV header #656

Closed gileshall closed 5 months ago

gileshall commented 5 months ago

Lirical 2.0.1 writes an extraneous tab in the column header of its TSV output. The header specifies nine columns, but the data rows that follow have only eight. Loading the TSV file into Python and splitting the header line on the tab character returns this list:

['rank', 'diseaseName', 'diseaseCurie', 'pretestprob', 'posttestprob', 'compositeLR', '', 'entrezGeneId', 'variants']

Note the empty column name between compositeLR and entrezGeneId.
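
For anyone who needs to read files already produced by 2.0.1, here is a minimal sketch of how the mismatch could be detected and worked around. It assumes the header is the first non-blank line of the text being parsed (skip any leading metadata lines first if your file has them). The sample data row and the function name are made up for illustration and are not part of Lirical:

```python
import csv
import io


def read_tsv_dropping_empty_header_fields(text):
    """Parse TSV text and drop empty header names.

    Returns (raw_header, cleaned_header, data_rows). This is a hypothetical
    helper, not part of Lirical.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    rows = [row for row in reader if row]  # skip blank lines
    raw_header, data_rows = rows[0], rows[1:]
    # The stray tab shows up as an empty column name; remove it.
    cleaned_header = [name for name in raw_header if name != ""]
    return raw_header, cleaned_header, data_rows


# Header copied from the report above; the data row is invented.
sample = (
    "rank\tdiseaseName\tdiseaseCurie\tpretestprob\tposttestprob\t"
    "compositeLR\t\tentrezGeneId\tvariants\n"
    "1\tExample disease\tOMIM:000000\t0.001\t0.9\t3.2\tNCBIGene:0\texample-variant\n"
)

raw_header, header, rows = read_tsv_dropping_empty_header_fields(sample)

# Only trust the cleaned header if it now matches every data row.
widths_match = all(len(row) == len(header) for row in rows)
print(len(raw_header), len(header), widths_match)
```

Checking that every row has the same width as the cleaned header guards against silently misaligning columns if the file is malformed in some other way.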

For reference, this is how Lirical was executed:

lirical prioritize \
  --assembly=hg38 \
  --vcf=input_variants.vcf.gz \
  --sample-id=sample_name \
  --observed-phenotypes=HP:0001,HP:0002 \
  --sex=MALE \
  --exomiser-hg38=/lirical/lirical-cli-2.0.1/data/2302_hg38_variants.mv.db \
  --parallelism=20 \
  --prefix=sample_name_lirical \
  --output-format=tsv \
  --output-directory=lirical-results \
  --data=/lirical/lirical-cli-2.0.1/data

ielis commented 5 months ago

Hi @gileshall, yes, there is an extra \t in the TSV template. I will fix the template to remove it. Thank you for reporting; I will keep you posted.

ielis commented 5 months ago

Please use v2.0.2 and reopen if the issue persists. Thank you for reporting!