igvteam / igv-reports

Python application to generate self-contained pages embedding IGV visualizations, with no dependency on original input files.
MIT License

Issues printing multiple INFO columns #43

Closed IvantheDugtrio closed 3 years ago

IvantheDugtrio commented 4 years ago

To whom this may concern,

I am trying to include multiple INFO field columns in the HTML report, but for some reason it only prints values from the first column. Also, the column header names are missing for the specified info-columns.

The following command was used:

create_report SS-10441949-S-123019_S22_concensus.filtered.ann.vcf.gz GRCh37.fa --ideogram cytoBandIdeo.txt --info-columns cosmic_gene cosmic_hgvsc cosmic_hgvsp cosmic_legacy_id --tracks SS-10441949-S-123019_S22_concensus.bam --output SS-10441949-S-123019.html

The VCF was generated using GATK v4.1 HaplotypeCaller. Annotation was done with vcfanno and cosmic/clinvar VCFs. We have additional problems trying to parse the FORMAT:AF and FORMAT:AFDP fields into the html report.

See the attached annotated and filtered VCF and the generated HTML report. Rename the report's extension to .html before use.

SS-10441949-S-123019_S22_consensus.filtered.ann.vcf.gz SS-10441949-S-123019.html.txt

Thanks, Ivan

jrobinso commented 4 years ago

Thanks for the report and sample data. I'm looking into it now. I am hampered by two things: I am not really a Python programmer and I'm not a VCF expert. Other than that, no problem. It looks like there might be an indentation problem around some if/else statements.

jrobinso commented 4 years ago

Hmmm, the table is all there in the HTML, it's just very wide. See the screenshot. So the problem is in the HTML template, not the Python.

[Image: Screen Shot 2020-09-24 at 8 32 47 PM]

jrobinso commented 4 years ago

Also, there is some discrepancy in the output html you posted and what I am generating. Could you verify that you are running the latest released version?

jrobinso commented 4 years ago

Never mind, the discrepancy is because I removed some rows to speed up testing. I think what's happening here is that each table column's width is controlled by the widest cell in any row. Some of the cosmic gene values are very long, causing the column to be around 5,000 pixels wide. If you generate the report above with just the first row of the VCF file, I think you will see this. I don't know what the right answer is here, but the fix will be in the template file used for the HTML. I see you are using the default template (none specified), which is igv_reports/templates/variant_template.html.
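One way to check which INFO keys carry the very long values is a small standalone script. The following is only a sketch, not part of igv-reports: the function name `longest_info_values` and the example records are made up for illustration, and it assumes an uncompressed, tab-delimited VCF body with the INFO field in column 8.

```python
def longest_info_values(vcf_lines, keys):
    """Return {key: length of the longest value seen} for the given INFO keys.

    Hypothetical helper for illustration; not an igv-reports function.
    """
    longest = {key: 0 for key in keys}
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        # INFO is the 8th column: semicolon-separated KEY=VALUE entries
        for entry in fields[7].split(";"):
            key, sep, value = entry.partition("=")
            if sep and key in longest:
                longest[key] = max(longest[key], len(value))
    return longest


# Made-up records, only to show the call shape
example_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tT\t50\tPASS\tcosmic_gene=GENE1,GENE2,GENE3;cosmic_legacy_id=COSM1",
    "1\t200\t.\tG\tC\t50\tPASS\tcosmic_gene=GENE1",
]
print(longest_info_values(example_lines, ["cosmic_gene", "cosmic_legacy_id"]))
```

Keys whose longest value runs to hundreds or thousands of characters are the likely cause of the very wide columns.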

jrobinso commented 4 years ago

The table is in a div that had "overflow:auto", which should have created a horizontal scrollbar for this table. It's just not working. So I removed that style; you can see the result in the attached file. This is not a very satisfying solution for wide cell content, but it will be the solution until someone can suggest a better one. This is not an easy problem.

SS-10441949-S-123019.html.zip

IvantheDugtrio commented 4 years ago

Thanks Jim for the work! I also worked on the VCF filtering tool VCFANNO to get fewer annotations per column, so that the data would be more manageable. See the attached VCF and regenerated report.

I'm still not sure what to do about the AD and AF fields; they seem to be problematic for every tool I've tried parsing them with. I'll consult with the guys at bcftools on how to parse these multi-allelic fields.

SS-10441949-S-123019.zip SS-10441949-S-123019_S22_consensus.filtered.ann.vcf.gz
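For reference, a minimal sketch of how a per-sample AD value can be lined up with the alleles at a multi-allelic site. It assumes AD follows the VCF spec convention of one value per allele, REF first and then each ALT in order. The helper names `parse_sample` and `per_allele_depths` and the example record are hypothetical, not taken from bcftools or igv-reports.

```python
def parse_sample(format_field, sample_field):
    """Pair colon-separated FORMAT keys with one sample's values.

    Comma-separated values (e.g. AD at a multi-allelic site) become lists.
    """
    keys = format_field.split(":")
    values = sample_field.split(":")
    parsed = {}
    for key, value in zip(keys, values):
        parsed[key] = value.split(",") if "," in value else value
    return parsed


def per_allele_depths(ref, alt, format_field, sample_field):
    """Map each allele (REF first, then each ALT) to its AD count.

    Assumes AD has one value per allele; missing values (".") are skipped.
    """
    sample = parse_sample(format_field, sample_field)
    alleles = [ref] + alt.split(",")
    depths = sample["AD"]
    if isinstance(depths, str):
        depths = [depths]
    return {a: int(d) for a, d in zip(alleles, depths) if d != "."}


# Made-up multi-allelic record: REF=A, ALT=T,G
print(per_allele_depths("A", "T,G", "GT:AD:DP", "1/2:3,10,7:20"))
```

An AF-style field could be handled the same way, but whether it has one value per ALT or per allele depends on how the VCF header declares it, so check the header's Number attribute first.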

IvantheDugtrio commented 4 years ago

Also, does igv-reports support parsing a tab-separated table with defined columns such as CHR, POS, REF, ALT, AD, AF, COSMIC_ACC, CLINVAR_ACC, etc?

stevekm commented 4 years ago

I also am interested in parsing a more generic tab-separated table, instead of just VCF, though maybe that would be better for a separate GitHub Issue?

jrobinso commented 3 years ago

I don't think there are any open issues here. Tab-delimited table input support (#50) has been added and will be in the next release.