YuSugihara / QTL-seq

QTL-seq pipeline to identify causative mutations responsible for a phenotype

SNP index calculation error #12

Closed praneetha92 closed 3 years ago

praneetha92 commented 3 years ago

Hi @YuSugihara, I came across this error and couldn't figure out whether it is a problem with installing the package or some other technical issue. Can you please help me out? [Screenshot from 2020-12-01 19-56-09]

YuSugihara commented 3 years ago

Can you find the file "qtlseq_bam_fainal/40_qtlseq/snp_index.tsv.temp" in the directory where you executed QTL-seq? The error message mentions it.

You can restart the final stage, which calculates the SNP-index, with the qtlplot command. If the command fails at the same point, please share the resulting files if possible, and I will try to figure out the problem.

praneetha92 commented 3 years ago

Hi @YuSugihara, the folder created for 40_qtlseq is empty. I tried the qtlplot command with the VCF files. There are two files in the vcf directory, qtlseq.vcf.gz and qtlseq.vcf.gz.tbi, and I used the latter. This is the output. [Screenshot from 2020-12-07 16-06-37]

praneetha92 commented 3 years ago

Hi @YuSugihara, I used the qtlseq.vcf.gz file and ended up with the error I mentioned in my first comment.

YuSugihara commented 3 years ago

How about the BAM files in 20_bams? Maybe there were too few sequence reads.

Also, please confirm that you input the proper files. If you give the same file as both bulk1 and bulk2, there will be no variants between them.
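One way to rule out the "same file for both bulks" case is to compare checksums of the two inputs. The following is a minimal sketch, not part of QTL-seq; the function names and the example file names are hypothetical.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files have byte-identical content."""
    return file_sha256(path_a) == file_sha256(path_b)


# Hypothetical usage; replace with your own bulk input files:
# if same_content("bulk1.fastq.gz", "bulk2.fastq.gz"):
#     print("bulk1 and bulk2 are identical; no variants can be found")
```

This only detects byte-identical files. Two different files from the same sample would still pass the check.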

praneetha92 commented 3 years ago

Hi @YuSugihara, the sizes of the BAM files are as follows:

  1. bulk1.filt - 3.6 GB
  2. bulk2.filt - 1.2 GB
  3. parent.filt - 712 bytes

I made sure that bulk1 and bulk2 are different, but I couldn't solve the error. Can you please tell me whether the format (i.e. uncompressed or gz) of the FASTQ files matters when creating the BAM file?

YuSugihara commented 3 years ago

The BAM file of the parental cultivar is too small. I guess that is the cause of the problem.
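A file of only a few hundred bytes likely holds little more than a header, so a quick size check on the BAM directory can catch this before the SNP-index step. Below is a minimal sketch; the function name and the 1 MB threshold are arbitrary assumptions for illustration, not values used by QTL-seq.

```python
import os
import sys


def flag_small_files(paths, min_bytes=1_000_000):
    """Return (path, size) pairs for files smaller than min_bytes.

    The default threshold is an arbitrary assumption; pick one that
    suits your expected sequencing depth.
    """
    flagged = []
    for path in paths:
        size = os.path.getsize(path)
        if size < min_bytes:
            flagged.append((path, size))
    return flagged


if __name__ == "__main__":
    # Pass the BAM files to check as command-line arguments.
    for path, size in flag_small_files(sys.argv[1:]):
        print(f"{path}: only {size} bytes, probably contains no alignments")
```

With the sizes reported above, only the parent file (712 bytes) would be flagged.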

praneetha92 commented 3 years ago

Hi @YuSugihara, can you please tell me if there is anything I can fix on my side to solve the error?

YuSugihara commented 3 years ago

If you input a FASTQ file, please confirm that it has proper content. Without sequence reads, no result can be produced.

If you input the BAM file directly, then your alignment failed.
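To confirm that a FASTQ file actually contains reads, you can count its records. This is a minimal sketch that assumes the standard four-lines-per-record FASTQ layout and reads either an uncompressed or a gzip-compressed file; the function name is hypothetical and not part of QTL-seq.

```python
import gzip


def count_fastq_reads(path):
    """Count reads in a FASTQ file (plain text or .gz).

    Assumes four lines per record, i.e. no wrapped sequence lines.
    """
    opener = gzip.open if path.endswith(".gz") else open
    lines = 0
    with opener(path, "rt") as handle:
        for _ in handle:
            lines += 1
    return lines // 4


# Hypothetical usage; replace with your own parent FASTQ file:
# print(count_fastq_reads("parent_1.fastq.gz"))
```

A count of zero, or one far below what you expect from the sequencing run, would explain a near-empty BAM file.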

praneetha92 commented 3 years ago

Hi @YuSugihara, for the test dataset, can you please tell me where you got the parents' (Nortai and Hitomebore) FASTQ sequences from?

YuSugihara commented 3 years ago

Sorry, I could not understand what you mean. If you want to try the test dataset, you can copy and paste the lines in 'qtlseq.sh'.

praneetha92 commented 3 years ago

Hi @YuSugihara, can you please tell me whether you downloaded the parent FASTQ files from NCBI?

YuSugihara commented 3 years ago

The FASTQ files for the test dataset are stored in the 'test' directory on GitHub. Do you want to download the full FASTQ files?

praneetha92 commented 3 years ago

Hi @YuSugihara, yes, I want to download the full FASTQ files.

YuSugihara commented 3 years ago

OK. If you want to try the full FASTQ files, please check our preprint. You can search for the IDs given in the caption of Fig. 1: https://www.biorxiv.org/content/10.1101/2020.06.28.176586v1.full.pdf

They are deposited on DRAsearch.