lgmgeo / AnnotSV

Annotation and Ranking of Structural Variation
GNU General Public License v3.0

Difference in results between "knotAnnotSV online" and "knotAnnotSV command-line" versions #190

Closed lmanchon closed 1 year ago

lmanchon commented 1 year ago

Hi,

I used AnnotSV and knotAnnotSV (latest version from GitHub) with:

$ANNOTSV/bin/AnnotSV -SVinputFile my_AnnotSV.bed -outputFile ./my_annotated.tsv -svtBEDcol 4 -vcf 1
perl ./knotAnnotSV.pl --annotSVfile my_annotated.tsv --configFile ./config_AnnotSV.yaml --outDir ./example

My BED file is attached: my_AnnotSV.bed.txt

Parsing config file....
Parsing AnnotSV header in order to get column names ...
Undefined field (typo error in config file or absent in annotation file (ex: exomiser): Human_pheno_evidence in Exomiser_gene_pheno_score
Undefined field (typo error in config file or absent in annotation file (ex: exomiser): Mouse_pheno_evidence in Exomiser_gene_pheno_score
Undefined field (typo error in config file or absent in annotation file (ex: exomiser): Fish_pheno_evidence in Exomiser_gene_pheno_score
5_2_22_0.0000_15_69531794_71331384_DUP_1
5_2_12_0.0000_11_63265070_64264587_DUP_1
3_3_24_0.0000_01_10502_640663_DEL_1
3_3_17_0.0000_01_12953116_13350064_DEL_1
3_3_5_0.0000_021_9838985_10668347_DEL_1
3_3_5_0.0000_016_33569890_35222814_DEL_1
3_3_4_0.0000_022_15054320_15766172_DEL_1
3_3_2_0.0000_022_10746895_12954787_DEL_1
3_3_2_0.0000_08_39334711_39533484_DEL_1
3_3_1_0.0000_07_111828740_112179547_DEL_1
3_3_0_0.0000_0Y_26441248_56869545_DEL_1
3_2_38_0.0000_021_7653346_9748815_DUP_1
3_2_3_0.0000_111_31490262_31792020_DUP_1
3_2_0_0.0000_014_18661686_18762857_DUP_1
3_2_0_0.0000_09_42280027_42480368_DUP_1
1_2_32_0.0000_015_20240479_22270223_DUP_1
1_2_27_0.0000_08_7312830_8000337_DUP_1
1_2_6_0.0000_014_105849139_106248712_DUP_1

Done!

I don't get the same results as with the online version: some columns don't appear in the HTML file generated from the command line. Can you tell me why?

lgmgeo commented 1 year ago

Hi,

Various output formats are available online to visualize the AnnotSV results:

Your question looks like a "knot" question. Have you compared 2 knot HTML files (one obtained from the web server and the other from a command line)? What are the columns that are not appearing in the html file generated with the command lines?

The difference may come from the knot configuration file (config_AnnotSV.yaml).

Best, Véronique

lmanchon commented 1 year ago

Hi,

In the HTML file generated from the command line, the last columns don't appear. I've attached my input file and the two generated HTML files; in the file generated online, the last columns ("Callers, BAF...") do appear. kannotsv.zip

lgmgeo commented 1 year ago

OK, got it. The 4 annotations from your BED input file (Score, Length(Mb), Callers, BAF) are not reported.

If you want to add them, you need to configure knotAnnotSV. The end of your knot configuration file (config_AnnotSV.yaml) should look like this: [image]
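A quick way to see which annotation columns the knot configuration does not cover is to compare the AnnotSV TSV header with the field names in config_AnnotSV.yaml. The following is only a minimal sketch: it assumes the top-level keys of the YAML file are the column names (which the "Undefined field" messages above suggest), and the helper names `top_level_keys` and `compare_columns`, as well as the nested `POSITION` key in the usage example, are hypothetical and not part of knotAnnotSV.

```python
def top_level_keys(yaml_text):
    """Return the top-level mapping keys of a simple YAML document.

    Assumption: the config is a flat mapping of field names to nested
    options, so a line-based scan is enough (no PyYAML needed).
    """
    keys = []
    for line in yaml_text.splitlines():
        stripped = line.strip()
        # Skip blank lines, comments, document markers and nested lines.
        if not stripped or stripped.startswith("#"):
            continue
        if line.startswith(("---", " ", "\t")):
            continue
        if ":" in line:
            keys.append(line.split(":", 1)[0].strip().strip("'\""))
    return keys


def compare_columns(tsv_header_line, yaml_text):
    """Return (columns absent from the config, config fields absent from the TSV)."""
    columns = tsv_header_line.rstrip("\n").split("\t")
    fields = top_level_keys(yaml_text)
    missing_from_config = [c for c in columns if c not in fields]
    missing_from_tsv = [f for f in fields if f not in columns]
    return missing_from_config, missing_from_tsv


if __name__ == "__main__":
    # Hypothetical header and config, for illustration only.
    header = "AnnotSV_ID\tSV_chrom\tScore\tCallers\tBAF"
    config = (
        "---\n"
        "AnnotSV_ID:\n"
        "    POSITION: 1\n"
        "SV_chrom:\n"
        "    POSITION: 2\n"
    )
    not_in_config, not_in_tsv = compare_columns(header, config)
    print("Columns not in config:", not_in_config)
    print("Config fields not in TSV:", not_in_tsv)
```

Columns reported as missing from the config are candidates to add at the end of config_AnnotSV.yaml; config fields reported as missing from the TSV correspond to the "Undefined field" warnings.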

lmanchon commented 1 year ago

OK, I understand now. I have also tested vcf2circos using the VCF output from AnnotSV, and it failed:

python3 /home/NGS/AnnotSV/share/python3/variantconvert//variantconvert convert -i ./my_AnnotSV.tsv -o ./my_AnnotSV.vcf -fi annotsv -fo vcf -c /home/NGS/AnnotSV/share/python3/variantconvert//configs/GRCh38/annotsv3_from_bed.local.json

2023-08-10 10:03:59 [INFO] running variantconvert 1.2.2
2023-08-10 10:03:59 [INFO] variantconvert finished.

vcf2circos --input my_AnnotSV.vcf --options Static/options.json --output my_AnnotSV.html -a hg38


                 __ ___      _
                / _|__ \    (_)
     __   _____| |_   ) |___ _ _ __ ___ ___  ___
     \ \ / / __|  _| / // __| | '__/ __/ _ \/ __|
      \ V / (__| |  / /| (__| | | | (_| (_) \__ \
       \_/ \___|_| |____\___|_|_|  \___\___/|___/

Author: Jean-Baptiste Lamouche, Antony Le Bechec, Jin Cui
Version: 1.1
Last update: Mars 26 2023

[INFO] Input file: my_AnnotSV.vcf (format 'vcf')
[INFO] Output file: my_AnnotSV.html (format 'html')
[INFO] Options provided.

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/vcf/parser.py", line 518, in _parse_info
    val = self._map(int, vals)
  File "/usr/local/lib/python3.8/dist-packages/vcf/parser.py", line 475, in _map
    return [func(x) if x not in bad else None for x in iterable]
  File "/usr/local/lib/python3.8/dist-packages/vcf/parser.py", line 475, in <listcomp>
    return [func(x) if x not in bad else None for x in iterable]
ValueError: invalid literal for int() with base 10: '24.0|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/NGS/.local/bin/vcf2circos", line 33, in <module>
    sys.exit(load_entry_point('vcf2circos', 'console_scripts', 'vcf2circos')())
  File "/home/NGS/vcf2circos/vcf2circos/main.py", line 119, in main
    js = Datafactory(input_file, options).plot_dict()
  File "/home/NGS/vcf2circos/vcf2circos/datafactory.py", line 25, in plot_dict
    pc = Plotconfig(
  File "/home/NGS/vcf2circos/vcf2circos/plotcategories/plotconfig.py", line 100, in __init__
    self.data = self.process_vcf()
  File "/home/NGS/vcf2circos/vcf2circos/utils.py", line 481, in timeit_wrapper
    result = func(*args, **kwargs)
  File "/home/NGS/vcf2circos/vcf2circos/plotcategories/plotconfig.py", line 150, in process_vcf
    for record in self.vcf_reader:
  File "/usr/local/lib/python3.8/dist-packages/vcf/parser.py", line 706, in __next__
    info = self._parse_info(row[7])
  File "/usr/local/lib/python3.8/dist-packages/vcf/parser.py", line 522, in _parse_info
    val = self._map(float, vals)
  File "/usr/local/lib/python3.8/dist-packages/vcf/parser.py", line 475, in _map
    return [func(x) if x not in bad else None for x in iterable]
  File "/usr/local/lib/python3.8/dist-packages/vcf/parser.py", line 475, in <listcomp>
    return [func(x) if x not in bad else None for x in iterable]
ValueError: could not convert string to float: '24.0|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.|.'

The online version works nicely with my VCF output, but it fails when I use the command line. Do you think the problem also comes from the JSON configuration file?
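The traceback shows the VCF parser trying `int` and then `float` on an INFO value that contains `|` separators, which suggests an INFO field declared as numeric in the VCF header while the converted file stores a pipe-joined string. To find which field is affected, one could scan the VCF with a small diagnostic like the sketch below. It is not part of vcf2circos, variantconvert or PyVCF; the function names are hypothetical, and the `Score` key in the usage example is only a placeholder, since the log does not say which INFO key is the offending one.

```python
import re

# Matches header lines such as ##INFO=<ID=X,Number=.,Type=Integer,...>
INFO_RE = re.compile(r"##INFO=<ID=([^,]+),Number=([^,]+),Type=([^,>]+)")

# Mirror the behaviour seen in the traceback: Integer values are tried
# as int first, then as float; Float values are tried as float only.
CASTS = {"Integer": (int, float), "Float": (float,)}


def declared_info_types(lines):
    """Map each INFO ID to its declared Type from the VCF header."""
    types = {}
    for line in lines:
        match = INFO_RE.match(line)
        if match:
            types[match.group(1)] = match.group(3)
    return types


def _parses(item, casts):
    """Return True if any of the casts accepts the value."""
    for cast in casts:
        try:
            cast(item)
            return True
        except ValueError:
            continue
    return False


def find_type_mismatches(lines):
    """Return (chrom, pos, key, declared_type, value) for unparsable INFO values."""
    lines = list(lines)
    types = declared_info_types(lines)
    problems = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue
        for entry in cols[7].split(";"):
            if "=" not in entry:
                continue  # flag-type INFO entry, nothing to check
            key, value = entry.split("=", 1)
            casts = CASTS.get(types.get(key))
            if casts is None:
                continue  # String, Character, Flag or undeclared key
            for item in value.split(","):
                if item in (".", ""):
                    continue  # missing value
                if not _parses(item, casts):
                    problems.append((cols[0], cols[1], key, types[key], item))
    return problems


if __name__ == "__main__":
    with open("my_AnnotSV.vcf") as handle:
        for problem in find_type_mismatches(handle):
            print(problem)
```

Any field this reports is declared as Integer or Float but carries a non-numeric value; whether the fix belongs in the JSON conversion config or in vcf2circos itself is a question for the tool authors.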

lgmgeo commented 1 year ago

Actually I think so... but I don't feel competent to help you.

A new vcf2circos version is under active development by @JbaptisteLam. I was going to suggest that you contact Jean-Baptiste on the vcf2circos Github... but I see that you have preceded me :o) https://github.com/bioinfo-chru-strasbourg/vcf2circos/issues/31

lmanchon commented 1 year ago

Yes, that's right, I sent my failing command to the vcf2circos GitHub. Is @JbaptisteLam working with you on the website https://lbgi.fr/AnnotSV ? His tool works well online and generates the circos plot, but not from the command line.

lgmgeo commented 1 year ago

It's OK, I contacted JB and he will answer you asap.

Have fun with AnnotSV, knotAnnotSV and vcf2circos!

lmanchon commented 1 year ago

thank you, best.