Closed: bshrestha0 closed this issue 2 years ago
Hello @bshrestha0,
Thank you very much for your interest in our software, finder. It should have produced the *tab files for all of the RNA-Seq samples, so I need to inspect this further. Could you please send me the metadata CSV along with the genome file that you are trying to annotate? Also, I am not sure why the download from NCBI is behaving erratically. I can see that the sample SRR14458419 does not have a correctly named file. Could you tell me what command you used to download the files?
Thank you.
Hi Sagnik,
Here's my metadata csv file:
BioProject,SRA Accession,Tissues,Description,Date,Read Length (bp),Ended,RNA Seq,process,Location
PRJNA548230,SRR9265068,not_available,cDNA;Illumina HiSeq 4000,3/15/17,150,PE,1,1,RNA_reads
PRJNA271608,SRR1741331,whole_worm,cDNA;Illumina HiSeq 2000,3/15/17,150,PE,1,1,RNA_reads
PRJNA727816,SRR14458419,whole_worm_stress,cDNA;DNBSEQ-G400,3/15/17,150,PE,1,1,RNA_reads
PRJNA215361,SRR953130,embryo,cDNA;Illumina HiSeq 2000,3/15/17,150,PE,1,1,RNA_reads
PRJNA215361,SRR953118,L2-L3,cDNA;Illumina HiSeq 2000,3/15/17,150,PE,1,1,RNA_reads
PRJNA574273,SRR10189241,adults_embryo,cDNA;Illumina HiSeq 2500,3/15/17,150,PE,1,1,RNA_reads
PRJNA215361,SRR953117,L1,cDNA;Illumina HiSeq 2000,3/15/17,150,PE,1,1,RNA_reads
The description and date in the metadata file may not be accurate, but I guess that shouldn't matter. "RNA_reads" is the folder where I downloaded the SRA reads. When I used Finder to download the data, I removed "RNA_reads" from the metadata csv file but kept the rest as is. For this test run, the genome file I used is the soft-masked C. elegans genome available from Ensembl Genomes. I can email you the genome file if you share your email address with me. The command that I used for the run:
finder -no_cleanup -mf elegans_metadata.csv -n 18 -gdir_star $PWD/star_index_without_transcriptome \
-out_dir $PWD/FINDER_elegans -g $genome/elegans_genome_sm-filtered.fa \
-p $PWD/ensembl_elegans.pep -gdir_olego $olego/olego_index -preserve -pc_clean
To download the SRA files, I used fastq-dump from sratoolkit in a loop:
while read LINE
do
let count++
fastq-dump --defline-seq '@$sn[_$rn]/$ri' --split-files $LINE
echo "$LINE"
done < $FILENAME
$FILENAME is a file listing the SRR IDs of the libraries that I want to download, one per line.
Thanks,
Bikash
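For anyone reproducing this, the download loop above can be made a little more defensive. The following is only a sketch, assuming paired-end libraries (as in the metadata above), for which fastq-dump --split-files writes <accession>_1.fastq and <accession>_2.fastq. The function names has_both_mates and download_all, and the file name srr_ids.txt, are hypothetical and not part of Finder or sratoolkit.

```shell
#!/usr/bin/env bash
# Sketch only: download each accession, then confirm both mate files exist.
# has_both_mates and download_all are hypothetical helper names.

# Succeeds only if both paired-end FASTQ files exist and are non-empty.
has_both_mates() {
    local dir=$1 acc=$2
    [ -s "$dir/${acc}_1.fastq" ] && [ -s "$dir/${acc}_2.fastq" ]
}

# Reads one accession per line from the list file and downloads it.
download_all() {
    local list=$1 out_dir=$2 acc
    mkdir -p "$out_dir"
    while IFS= read -r acc; do
        [ -n "$acc" ] || continue            # skip blank lines
        fastq-dump --defline-seq '@$sn[_$rn]/$ri' --split-files \
            --outdir "$out_dir" "$acc"
        if ! has_both_mates "$out_dir" "$acc"; then
            echo "incomplete download: $acc" >&2
        fi
    done < "$list"
}

# Example call (commented out so that sourcing this file downloads nothing):
# download_all srr_ids.txt RNA_reads
```

A library that is reported as incomplete here would be a candidate for the kind of naming problem mentioned for SRR14458419, although the check itself says nothing about the cause.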
Hi @bshrestha0,
Thank you for sending me the files. I will investigate further. Could you please send me the genome file elegans_genome_sm-filtered.fa and the proteins file ensembl_elegans.pep? You could email those to sagnikbanerjee15@gmail.com
Thanks.
Hello, I've encountered a similar issue. Have you been able to find the reason behind it and a solution?
Hi @Maxim-Karpov,
Thank you for your patience. I found an issue with the code. The new version of finder will take care of this.
Thank you.
Hi there,
I am getting an error while using Finder with RNA-Seq data available in my local directory. Here's the error:
Here's the output (partial) generated in the alignments directory:
As you can see, no SRR9265068_round3_SJ.out.tab file was created during the third round, although a SRR9265068_final_SJ.out.tab file was created. However, for some libraries it did create "round3_SJ.out.tab" files after completing the third round, as shown below:
I also tried using Finder to download the SRA files, but it did not download them properly. Please see the attachment. So I downloaded the RNA reads to my local computer and used them as input.
Any suggestions on how to fix this?
Thank you
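To see at a glance which libraries are affected, one option is to compare the accessions in the metadata csv against the files in the alignments directory. This is a minimal sketch, assuming the accession is the second column of the metadata csv and that the files follow the <accession>_round3_SJ.out.tab pattern quoted above. The function name missing_round3 and the alignments path in the example call are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: list accessions from the metadata csv that have no
# <accession>_round3_SJ.out.tab file in the given alignments directory.
# missing_round3 is a hypothetical helper name, not part of Finder.

missing_round3() {
    local metadata=$1 align_dir=$2 acc
    # Column 2 of the metadata csv is "SRA Accession"; skip the header row.
    tail -n +2 "$metadata" | cut -d, -f2 | while IFS= read -r acc; do
        [ -n "$acc" ] || continue            # skip blank lines
        [ -e "$align_dir/${acc}_round3_SJ.out.tab" ] || echo "$acc"
    done
}

# Example call; the alignments path is an assumption about the output layout:
# missing_round3 elegans_metadata.csv FINDER_elegans/alignments
```

The function prints one accession per line for every library without a round-3 file and prints nothing when all of them are present.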