**Describe the bug**
There is an option to specify the SNP position file format and another to specify the output format:
-f Format of SNPs file [ddc1,maf,db,dccq,vcf,torrent,RNA,DNA], Def=dcc1
--of Output file format [rows,columns]
We suspect these descriptions are incorrect or ambiguous:
some SNP file formats are deprecated (dcc1, dccq);
the "compoundsnp" mode should be deprecated because it can only take the dcc1 format;
some formats are not supported and throw a Java exception ("dq", "torrent", "RNA", "DNA");
"txt", "tab", "columns", "vcf" and "maf" appear to be supported formats, but they behave differently depending on the "--of" option.
To Reproduce
Here we tested "-f" and "--of" with three sets of SNP position files:
SNP position file in txt format with tab delimiter, e.g. DPYD_97981343_A_C,T chr1 97981343 97981343 A C,T. Of the five runs below, two failed and three succeeded, producing the same output in column format.
-f txt --of columns: run succeeds, but the log file shows the input and output are in "columns" format.
-f tab --of columns: run succeeds, but the log file shows the input and output are in "columns" format.
-f columns --of columns: run succeeds; the log file shows the input and output are in "columns" format.
-f maf --of columns: run succeeds; the log file shows the input and output are in "columns" format.
-f vcf --of columns: run succeeds; the log file shows the input and output are in "columns" format.
Same SNP position file as above, but with the "--of" option omitted or set to any other value. Three runs failed and two succeeded, producing the same output in row format.
-f txt --of any: run fails with an exception because "txt" is not supported.
-f tab --of any: run succeeds; the log file shows the input is in "tab" format.
-f columns --of any: run succeeds; the log file shows the input is in "columns" format.
-f maf --of any: run fails with an exception because the input is not in MAF format.
-f vcf --of any: run fails with an exception because the input is not in VCF format.
SNP position file in MAF format. Both runs succeed and produce the same output.
-f maf --of columns: run fails; the log file shows the input and output as "columns" format, but the file is MAF, not columns.
-f maf --of any: run succeeds; the log file shows the input is in "maf" format.
SNP position file in VCF format.
-f vcf --of columns: run fails; the log file shows the input and output as "columns" format, but the file is VCF, not columns.
-f vcf --of any: run succeeds; the log file shows the input is in "vcf" format.
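For reference, the tab-delimited example line from the first test set can be split into six fields. The sketch below is only illustrative: the field names (`name`, `chrom`, `start`, `end`, `ref`, `alts`) are assumptions made for this example and are not taken from the adamajava code.

```python
from typing import List, NamedTuple


class SnpPosition(NamedTuple):
    # Field names are hypothetical labels for the six columns.
    name: str
    chrom: str
    start: int
    end: int
    ref: str
    alts: List[str]


def parse_snp_line(line: str) -> SnpPosition:
    """Split one tab-delimited SNP position line into its six fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        raise ValueError(f"expected 6 tab-separated fields, got {len(fields)}")
    name, chrom, start, end, ref, alt = fields
    # The last column may hold several comma-separated alternate alleles.
    return SnpPosition(name, chrom, int(start), int(end), ref, alt.split(","))


example = "DPYD_97981343_A_C,T\tchr1\t97981343\t97981343\tA\tC,T"
print(parse_snp_line(example))
```

A line in MAF or VCF layout would not match this six-column shape, which is consistent with those files failing when they are read as "columns".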
Expected behavior
"--of columns" should only work for a "columns" format SNP file. The existing code always converts any format to "columns" if this option is specified.
"tab" and "columns" formats are treated the same: for the same SNP position file, specifying either "tab" or "columns" gives the same output. "columns" should be removed to avoid unnecessary confusion. This also makes it consistent with coverage mode, which takes "tab" but not "columns".
"txt" is not supported; "columns" should be removed because the code treats it the same as "tab".
SNP mode should only support the "columns", "maf" and "vcf" formats; "dccq" and "dcc1" are deprecated.
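The expected checks could be sketched roughly as follows. This is a hypothetical illustration of the validation described above, not adamajava code: the function name, the sets, and the error messages are all assumptions, and "tab" and "columns" are treated as aliases because the report says the existing code handles them identically.

```python
# Hypothetical sketch; names and messages are illustrative only.
COLUMN_LIKE = {"tab", "columns"}   # reported as being treated the same
DEPRECATED = {"dcc1", "dccq"}
ACCEPTED = COLUMN_LIKE | {"maf", "vcf"}


def resolve_formats(snp_format, output_format=None):
    """Return (input_format, output_format) or raise ValueError."""
    if snp_format in DEPRECATED:
        raise ValueError(f"SNP format '{snp_format}' is deprecated")
    if snp_format not in ACCEPTED:
        raise ValueError(f"SNP format '{snp_format}' is not supported")
    if output_format == "columns":
        # Reject the combination instead of silently relabelling the input.
        if snp_format not in COLUMN_LIKE:
            raise ValueError(
                f"--of columns requires a column-format SNP file, got '{snp_format}'"
            )
        return snp_format, "columns"
    # Without "--of columns" the runs above produced row output.
    return snp_format, "rows"
```

With this shape, `-f maf --of columns` fails early with a clear message rather than reading a MAF file as if it were in "columns" format.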
Desktop (please complete the following information):
OS: macOS or Linux
Version: adamajava version 98-0412007c