freeseek / gtc2vcf

Tools to convert Illumina IDAT/BPM/EGT/GTC and Affymetrix CEL/CHP files to VCF
MIT License

GTC files cannot be listed through both command interface and file list when only submitting a .txt file #48

Closed hsmith9002 closed 2 years ago

hsmith9002 commented 2 years ago

Hi,

I am getting the error message "GTC files cannot be listed through both command interface and file list" even though I am only submitting a single .txt file with a list of the GTC file names. I have tried this both with the GTC files in the directory where I am running the script and with them in their own directory. I am running on a Google Cloud instance and using a Singularity container. Here is the code, and I have attached the gtc_list file.

```
bpm_manifest_file="./GDA_PGx-8v1-0_20042614_A2.bpm"
csv_manifest_file="./ProjectDetailReport ILMN GDA 07-11-22 AMS1.csv"
egt_cluster_file="./GDA FINAL 3 plate validation reclustered 06302022.egt"
path_to_gtc_folder="./gtc_file_list.csv"
ref="./GRCh38_full_analysis_set_plus_decoy_hla.fa" # or ref="$HOME/GRCh37/human_g1k_v37.fasta"
out_prefix="206486390022"

singularity exec gtc2vcf_072922.sif bcftools +gtc2vcf \
  --no-version -Ou \
  --bpm $bpm_manifest_file \
  --csv $csv_manifest_file \
  --egt $egt_cluster_file \
  --gtcs ./gtc_list_file.txt \
  --fasta-ref $ref \
  --output $out_prefix.vcf \
  --output-type v \
  --extra $out_prefix.tsv \
  --verbose
```

Thank you,
Harry

Attachment: gtc_list_file.txt

freeseek commented 2 years ago

The problem is that the values of your csv_manifest_file and egt_cluster_file variables contain spaces. Because the variables are not quoted, the shell splits each value into several words, and the extra words are interpreted as additional arguments, i.e., additional GTC files, which are not allowed when using the --gtcs option. Simply use quotes as follows to make it work:

bcftools +gtc2vcf \
  --no-version \
  --bpm $bpm_manifest_file \
  --csv "$csv_manifest_file" \
  --egt "$egt_cluster_file" \
  --gtcs ./gtc_list_file.txt \
  --fasta-ref $ref \
  --output $out_prefix.vcf \
  --output-type v \
  --extra $out_prefix.tsv \
  --verbose

Notice that using both the option -Ou and the option --output-type v, while allowed, is incorrect: -O is the short form of --output-type, so the latter option will simply override the former. Also notice that the csv_manifest_file can be supplied as a gzipped file.
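To see the word splitting in isolation, here is a minimal sketch that does not need bcftools. The count_args function is a hypothetical helper introduced only for illustration; the file name is the one from the question.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

egt_cluster_file="./GDA FINAL 3 plate validation reclustered 06302022.egt"

# Unquoted: the shell splits the value on spaces before calling the command.
unquoted=$(count_args --egt $egt_cluster_file)

# Quoted: the whole value is passed as a single argument.
quoted=$(count_args --egt "$egt_cluster_file")

echo "unquoted: $unquoted arguments"
echo "quoted:   $quoted arguments"
```

With the default IFS, the unquoted call reports 8 arguments and the quoted call reports 2, which is why the plugin ends up seeing stray positional arguments next to --gtcs.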

hsmith9002 commented 2 years ago

Thank you for the quick response! That seemed to fix it, but now I'm getting this: "Header of file ./ProjectDetailReport ILMN GDA 07-11-22 AMS1.csv is incorrect:".

The csv is attached.

Thank you,
Harry

Attachment: GTC_GDA_test_0826_ProjectDetailReport ILMN GDA 07-11-22 AMS1.csv

freeseek commented 2 years ago

Ah yeah, the manifest file name did seem weird to me. That's because GTC_GDA_test_0826_ProjectDetailReport ILMN GDA 07-11-22 AMS1.csv looks like a project report rather than a manifest file. You need to use the .csv manifest file for the GDA array, which you can find here.
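Before passing a file to --csv, it can help to look at its first few lines and compare them with a known manifest. A minimal sketch follows; peek_header is a hypothetical helper, not part of gtc2vcf, and it makes no assumption about which header the plugin actually expects.

```shell
#!/usr/bin/env bash
# peek_header is a hypothetical helper (not part of gtc2vcf).
# It prints the first lines of a file, whether plain or gzipped.
# Usage: peek_header FILE [NUM_LINES]
peek_header() {
  # gzip -cdf decompresses gzipped input and copies plain input through unchanged.
  gzip -cdf -- "$1" | head -n "${2:-5}"
}
```

For example, `peek_header "./ProjectDetailReport ILMN GDA 07-11-22 AMS1.csv"` prints the first five lines of the report, and the same call works on a gzipped manifest. Note the quotes around the file name, for the same reason as above.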

hsmith9002 commented 2 years ago

I think I got it working using the _A2.csv.gz manifest, since I'm using the _A2.bpm.