Open dmncyap opened 2 years ago
Hello! It is fairly common for a sample to be split across multiple FASTQ files. When that is the case, you have a couple of options. You could concatenate the separate FASTQ files prior to running STAR. For example, if you had 3 files per sample:
cat sample1_file1.fastq.gz sample1_file2.fastq.gz sample1_file3.fastq.gz > sample1_combined.fastq.gz
Do that for all your samples (a short shell loop works well) and then feed the combined files into STAR as you normally would.
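A rough sketch of such a loop is below. It assumes a hypothetical naming scheme where every piece of a sample is called <sample>_file<N>.fastq.gz and all pieces sit in one directory; the function name combine_samples is made up for illustration. Concatenating gzip files with cat gives a valid gzip file, so there is no need to decompress first.

```shell
# Sketch only: assumes files are named <sample>_file<N>.fastq.gz (hypothetical scheme).
combine_samples() {
    local dir="$1" first sample
    # Use each sample's "_file1" piece to discover the sample names.
    for first in "$dir"/*_file1.fastq.gz; do
        [ -e "$first" ] || continue   # nothing matched the glob
        sample="$(basename "$first" _file1.fastq.gz)"
        # The glob expands in lexical order (file1, file10, file2, ...).
        # Order rarely matters for single-end data, but for paired-end data
        # the R1 and R2 pieces must be concatenated in the same order.
        cat "$dir/${sample}"_file*.fastq.gz > "$dir/${sample}_combined.fastq.gz"
    done
}
```

Calling combine_samples /path/to/fastq_dir would write one <sample>_combined.fastq.gz per sample next to the original pieces.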
Alternatively, you could pass the names of all the separate files for each sample, separated by commas, to "--readFilesIn" (e.g. --readFilesIn file1.fastq.gz,file2.fastq.gz,file3.fastq.gz). STAR will essentially concatenate them for you during alignment. Although this works, I recommend merging the files beforehand. If you do use the comma-separated approach, make sure the files are separated by commas and not spaces. Commas separate multiple files belonging to the same sample, while a space separates the two mates (read 1 and read 2) when you are analyzing paired-end data.
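If you go the comma-separated route, a small helper can build the list so no stray spaces sneak in. This is just a sketch: join_by_comma is a made-up helper name, the file names and /path/to/genome_index are placeholders, and the STAR command is only printed here rather than run.

```shell
# Sketch only: join_by_comma is a hypothetical helper, not part of STAR.
join_by_comma() {
    local IFS=,
    # "$*" joins all arguments using the first character of IFS (a comma).
    printf '%s\n' "$*"
}

# Placeholder file names for one paired-end sample split into two pieces.
r1="$(join_by_comma sample1_file1_R1.fastq.gz sample1_file2_R1.fastq.gz)"
r2="$(join_by_comma sample1_file1_R2.fastq.gz sample1_file2_R2.fastq.gz)"

# Commas join pieces of the same mate; the single space separates mate 1 from mate 2.
# Printed rather than executed; the genome directory path is a placeholder.
echo "STAR --genomeDir /path/to/genome_index --readFilesCommand zcat --readFilesIn $r1 $r2"
```

For single-end data you would build only one list and pass it on its own after --readFilesIn.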
Hope this helps! Erick
Hi again! Some of the .sra runs I'm working with seem to have been split into smaller .sra runs. From my understanding, that means I have to merge the BAM files after aligning each of these runs separately. Does this mean I can't use --quantMode GeneCounts in STAR, because it would count the reads before the BAM files get merged? Thank you!