hyunhwan-jeong / SalmonTE

SalmonTE is an Ultra-Fast and Scalable Quantification Pipeline of Transposable Element (TE) Abundances
GNU General Public License v3.0

Error using SalmonTE #34

Closed: SalimMegat closed this issue 1 year ago

SalimMegat commented 5 years ago

Hi again,

I am still facing an issue with my mouse data. When I run SalmonTE on a set of FASTQ files downloaded from NCBI, I get the following error:

(snakemake) [smegat@hpc-login1 scripts]$ ./mouseSalmonTE.sh
2019-06-20 15:48:40,434 Starting quantification mode
2019-06-20 15:48:40,435 Collecting FASTQ files...
2019-06-20 15:48:40,435 SalmonTE assumes that '/b/home/medecine/smegat/fus_tdp43_fastq/tdp43_fastq/' is a directory, and SalmonTE will search any FASTQ file in the directory.
2019-06-20 15:48:50,369 A paired-end sample and a single-end sample are placed together.

I do not understand this, since all the files in the folder seem to be properly paired.
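One way to double-check the pairing independently of SalmonTE is to group the file names by sample. The following is a minimal sketch, assuming a hypothetical naming convention of `<sample>_1.fastq.gz` / `<sample>_2.fastq.gz` for mates and `<sample>.fastq.gz` for single-end or orphan reads. It is not SalmonTE's own detection logic, and `classify_fastq` is an illustrative name.

```python
import re
from collections import defaultdict

# Assumed naming convention (not taken from SalmonTE):
#   <sample>_1.fastq[.gz] and <sample>_2.fastq[.gz] are mates,
#   <sample>.fastq[.gz] holds single-end or orphan reads.
FASTQ_RE = re.compile(r"^(?P<sample>.+?)(?:_(?P<mate>[12]))?\.(?:fastq|fq)(?:\.gz)?$")


def classify_fastq(filenames):
    """Group FASTQ file names by sample and report each sample's layout."""
    found = defaultdict(set)
    for name in filenames:
        match = FASTQ_RE.match(name)
        if match is None:
            continue  # not a FASTQ file under the assumed convention
        found[match.group("sample")].add(match.group("mate") or "single")

    layout = {}
    for sample, kinds in found.items():
        if kinds == {"1", "2"}:
            layout[sample] = "paired"
        elif kinds == {"single"}:
            layout[sample] = "single"
        elif "single" in kinds:
            layout[sample] = "mixed"  # mate file(s) plus an unsuffixed file
        else:
            layout[sample] = "incomplete"  # only one of the two mates is present
    return layout
```

Calling `classify_fastq(os.listdir(directory))` on the input directory lists any sample that is not cleanly `paired` or `single`; a `mixed` entry would mean a sample has both mate files and an unsuffixed file.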

[Screenshot 2019-06-20 at 15:51:15]

Many thanks,

Salim.

SalimMegat commented 5 years ago

Now it detects 36 FASTQ files but processes them as single-end.

[Screenshot 2019-06-20 at 17:02:35] [Screenshot 2019-06-20 at 17:02:43]

hyunhwan-jeong commented 5 years ago

Can you let me know which SRA dataset you used and how you downloaded it?

Thank you,

Hyun-Hwan Jeong

SalimMegat commented 5 years ago

I downloaded the FASTQ files directly from https://www.ebi.ac.uk/ena/browse/read-download. The files seem to be fine, since I was able to map them with STAR without any problem.

Salim.

hyunhwan-jeong commented 5 years ago

@SalimMegat Got it, but I also need to know your dataset ID. Or, if you have a download script, could you send it to my email?

Thank you,

Hyun-Hwan Jeong

SalimMegat commented 5 years ago

Hi,

Actually, for this run I downloaded the SRA files with the script below and converted them to FASTQ:

wget_files.txt

Salim

hyunhwan-jeong commented 5 years ago

Sorry for the late response. I will be busy until Tuesday, but I can give you an update after that. Does that work for you?

Thank you,

Hyun-Hwan Jeong

SalimMegat commented 5 years ago

Hi,

I finally decided to run it as single-end reads and it worked, although I'd really like to understand why this is the only set of FASTQ files in which SalmonTE does not recognize paired-end reads. I have another question, about the output and downstream analysis. SalmonTE gives counts for the different TE families, but how did you track these TEs down afterwards? Are you using a BED file with all TE annotations? In brief, I'd like some visual information on whether or not a particular TE goes up.

Best,

Salim.

hyunhwan-jeong commented 5 years ago

Hello @SalimMegat,

> I finally decided to run it as single-end reads and it worked, although I'd really like to understand why this is the only set of FASTQ files in which SalmonTE does not recognize paired-end reads.

I would say this is not an issue on your side; it is certainly an issue with SalmonTE. I suspect there is some incompatibility between the contents of the FASTQ files and SalmonTE, but it can certainly be fixed. I will give you an update soon.

> Are you using a BED file with all TE annotations? In brief, I'd like some visual information on whether or not a particular TE goes up.

I don't have a BED file, but I have used annotation from Dfam (https://dfam.org/) for visualization and additional analysis. I could give a better answer if you tell me what type of visualization you want to do; as it stands, the question is too limited for me to understand.
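As a rough illustration of the "does this TE go up?" question, here is a minimal sketch that computes a log2 fold change per TE between two groups of samples. It assumes a CSV table whose first column is the TE name and whose other columns are per-sample expression values. That layout, the function name `log2_fold_changes`, and the pseudocount are assumptions made for illustration, not SalmonTE's documented output or the method used in the SalmonTE paper.

```python
import csv
import math


def log2_fold_changes(table, control, treated, pseudocount=1.0):
    """Return {TE name: log2(mean treated / mean control)}.

    `table` is an open text file (or any iterable of CSV lines) whose first
    column holds the TE name; `control` and `treated` are lists of sample
    column names. The pseudocount avoids division by zero for absent TEs.
    """
    reader = csv.DictReader(table)
    name_column = reader.fieldnames[0]
    changes = {}
    for row in reader:
        mean_control = sum(float(row[s]) for s in control) / len(control)
        mean_treated = sum(float(row[s]) for s in treated) / len(treated)
        ratio = (mean_treated + pseudocount) / (mean_control + pseudocount)
        changes[row[name_column]] = math.log2(ratio)
    return changes
```

A positive value means the TE is more abundant in the treated group and a negative value means it is less abundant. This says nothing about statistical significance, which needs a proper differential-expression test.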

Thank you,

Hyun-Hwan Jeong

hyunhwan-jeong commented 5 years ago

@SalimMegat, I need to ask how you converted the SRA files to FASTQ files. Can you share the command you used for the conversion?

Thank you,

Hyun-Hwan Jeong

SalimMegat commented 5 years ago

@hyunhwaj I downloaded the SRA file using the following command:

wget ftp://ftp-trace.ncbi.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR692/SRR6924174/SRR6924174.sra -P /b/home/medecine/smegat/

and I converted it to FASTQ using this loop:

while read x; do
    fastq-dump -I -v --split-3 --gzip /b/home/medecine/smegat/white_20month_FC_fastq/$x.sra
done < sra_list.txt
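For reference, `fastq-dump --split-3` writes mates to `*_1.fastq` and `*_2.fastq` and puts reads without a mate in a third `*.fastq` file, while `-I` appends the read number to each read name. Whether either option is what confuses SalmonTE is not established in this thread. The sketch below only checks how the read names in two mate files relate; it assumes four-line FASTQ records, and `read_names` and `compare_mates` are hypothetical helper names.

```python
import gzip
from itertools import islice


def read_names(path, limit=1000):
    """Return up to `limit` read names (first header token, without '@')."""
    opener = gzip.open if path.endswith(".gz") else open
    names = []
    with opener(path, "rt") as handle:
        # Assumes four-line FASTQ records: every fourth line is a header.
        for header in islice(handle, 0, limit * 4, 4):
            if header.startswith("@"):
                names.append(header[1:].split()[0])
    return names


def compare_mates(path_1, path_2, limit=1000):
    """Describe how the first `limit` read names of two mate files relate."""
    names_1 = read_names(path_1, limit)
    names_2 = read_names(path_2, limit)
    if len(names_1) != len(names_2):
        return "different read counts"
    if names_1 == names_2:
        return "identical names"
    if all(a[:-1] == b[:-1] and (a[-1], b[-1]) == ("1", "2")
           for a, b in zip(names_1, names_2)):
        return "names differ only by a trailing 1/2"
    return "names do not correspond"
```

Running `compare_mates` on one sample's `_1` and `_2` files, and checking whether an unsuffixed third file exists for that sample, gives concrete details to report back on the issue.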

Do you want me to share a small subset of the FASTQ files I am using?

Salim.

hyunhwan-jeong commented 5 years ago

@SalimMegat Thanks for providing this, and I will look into it!

Best,

Hyun-Hwan Jeong