Closed SalimMegat closed 1 year ago
Now it detects 36 FASTQ files but processes them as single-end.
Can you let me know which SRA dataset you used and how you downloaded it?
Thank you,
Hyun-Hwan Jeong
I downloaded the FASTQ files directly from https://www.ebi.ac.uk/ena/browse/read-download. The files seem to be OK, since I have been able to map them with STAR without any problem.
Salim.
@SalimMegat got it, but I also need to know what your dataset ID is. Or, if you have a script for the download, can you share it by email?
Thank you,
Hyun-Hwan Jeong
Hi,
Actually, for this run I downloaded the SRA file with the script below and converted it to FASTQ.
Salim
Sorry for the late response. I will be busy until Tuesday, but I can give you an update after that. Does that work for you?
Thank you,
Hyun-Hwan Jeong
Hi,
I finally decided to run it as single-end reads and it worked, although I'd really like to understand why this is the only set of FASTQ files where SalmonTE does not recognize paired-end reads. I have another question regarding output and downstream analysis. SalmonTE gives counts for different TE families, but my question is more about how you tracked these TEs down afterwards. Are you using a BED file with all TE annotations? In brief, I'd like to have some visual information on whether a particular TE goes up or not.
Best,
Salim.
Hello @SalimMegat,
I finally decided to run it as single-end reads and it worked, although I'd really like to understand why this is the only set of FASTQ files where SalmonTE does not recognize paired-end reads.
I would say this is not an issue on your side; it is certainly an issue with SalmonTE. I suspect there is some incompatibility between the contents of the FASTQ files and SalmonTE, but it can certainly be fixed. I will give you an update soon.
Are you using a BED file with all TE annotations? In brief, I'd like to have some visual information on whether a particular TE goes up or not.
I don't have a BED file, but I have used annotation from Dfam (https://dfam.org/) for the visualization and additional analysis. I could give a better answer if you provided more information about the type of visualization you want to do; as it stands, the question is hard for me to understand fully.
Thank you,
Hyun-Hwan Jeong
@SalimMegat, I need to ask how you have converted SRA files to FASTQ files. Can you share your command for the conversion?
Thank you,
Hyun-Hwan Jeong
@hyunhwaj I downloaded the SRA file using the following command:
wget ftp://ftp-trace.ncbi.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR692/SRR6924174/SRR6924174.sra -P /b/home/medecine/smegat/
and I converted it to FASTQ using this command:
while read x
do
fastq-dump -I -v --split-3 --gzip /b/home/medecine/smegat/white_20month_FC_fastq/$x.sra
done < sra_list.txt
Do you want me to share a small subset of the FASTQ files I am using?
Salim.
@SalimMegat Thanks for providing this, and I will look into it!
Best,
Hyun-Hwan Jeong
Hi again,
I am actually still facing an issue with the mouse data I have. When I try to run on a set of FASTQ files downloaded from NCBI I get the following error:
(snakemake) [smegat@hpc-login1 scripts]$ ./mouseSalmonTE.sh
2019-06-20 15:48:40,434 Starting quantification mode
2019-06-20 15:48:40,435 Collecting FASTQ files...
2019-06-20 15:48:40,435 SalmonTE assumes that '/b/home/medecine/smegat/fus_tdp43_fastq/tdp43_fastq/' is a directory, and SalmonTE will search any FASTQ file in the directory.
2019-06-20 15:48:50,369 A paired-end sample and a single-end sample are placed together.
I do not understand this, since all the files in the folder seem to be properly paired.
Many thanks,
Salim.
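One thing that may be worth ruling out for the error above: fastq-dump --split-3 writes reads whose mate is missing to a third, unsuffixed file next to the _1 and _2 files, so a directory can end up holding both paired-looking and single-looking FASTQ files. The thread does not show how SalmonTE itself decides pairing, so the following is only a sanity-check sketch. It assumes mates are named <sample>_1 and <sample>_2 (as fastq-dump writes them) and treats any other FASTQ name as single-end; the function names check_pairs and has_mate are made up for illustration.

```shell
#!/usr/bin/env bash
# Sanity-check mate pairing in a directory of FASTQ files.
# Assumption (not taken from SalmonTE): mates are named <sample>_1.* and
# <sample>_2.*; any other FASTQ file name is reported as single-end.

# Return 0 if a FASTQ file with the given stem exists in the directory.
has_mate() {
    local dir="$1" stem="$2" ext
    for ext in fastq fastq.gz fq fq.gz; do
        [ -e "$dir/$stem.$ext" ] && return 0
    done
    return 1
}

# Print one line per sample or file: "paired", "orphan" or "single".
check_pairs() {
    local dir="$1" f base sample
    for f in "$dir"/*.fastq "$dir"/*.fastq.gz "$dir"/*.fq "$dir"/*.fq.gz; do
        [ -e "$f" ] || continue        # skip glob patterns that matched nothing
        base="$(basename "$f")"
        base="${base%.gz}"             # strip the extension(s) to get the stem
        base="${base%.fastq}"
        base="${base%.fq}"
        case "$base" in
            *_1)
                sample="${base%_1}"
                if has_mate "$dir" "${sample}_2"; then
                    echo "paired  $sample"
                else
                    echo "orphan  $base"
                fi
                ;;
            *_2)
                sample="${base%_2}"
                # A complete pair was already reported via its _1 file.
                has_mate "$dir" "${sample}_1" || echo "orphan  $base"
                ;;
            *)
                echo "single  $base"
                ;;
        esac
    done | LC_ALL=C sort
}

# Run on the directory given as the first argument, if any.
if [ "$#" -gt 0 ]; then
    check_pairs "$1"
fi
```

Running it as, for example, bash check_pairs.sh /b/home/medecine/smegat/fus_tdp43_fastq/tdp43_fastq/ (the script file name is hypothetical) lists every sample. Any "single" or "orphan" line next to "paired" lines would point to a mixed directory, which is one possible, unconfirmed explanation for the message in the log.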