SRA runs are split - Githubissues

Hello! It is fairly common for a sample's reads to be split across multiple fastq files. When that is the case, you have a couple of options. The first is to concatenate the separate fastq files before running STAR. For example, if you had 3 files per sample:

cat sample1_file1.fastq.gz sample1_file2.fastq.gz sample1_file3.fastq.gz > sample1_combined.fastq.gz

Do that for all your samples (a simple loop will do) and then feed the combined files into STAR as you normally would.
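A loop for this could look something like the sketch below. It assumes the files follow a naming pattern like sample1_file1.fastq.gz, sample1_file2.fastq.gz, and so on; the function name combine_fastqs and the directory arguments are made up for illustration, so adjust them to match your own files. Concatenating gzipped files directly with cat is fine, since the result is still a valid gzip file.

```shell
#!/usr/bin/env bash
# Sketch: merge all per-sample FASTQ chunks into one combined file per sample.
# Assumed (hypothetical) naming: <sample>_file<N>.fastq.gz inside in_dir.
combine_fastqs() {
    local in_dir="$1" out_dir="$2"
    local first sample
    mkdir -p "$out_dir"
    # Use the "_file1" chunk of each sample to discover the sample names.
    for first in "$in_dir"/*_file1.fastq.gz; do
        [ -e "$first" ] || continue   # nothing matched the pattern
        sample="$(basename "$first" _file1.fastq.gz)"
        # The glob expands in sorted (lexical) order. For paired-end data,
        # make sure both mates are concatenated in the same chunk order.
        cat "$in_dir/${sample}"_file*.fastq.gz > "$out_dir/${sample}_combined.fastq.gz"
    done
}
```

You would then call it with your own directories, e.g. combine_fastqs raw_fastq combined_fastq, and point STAR at the files in combined_fastq.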

Alternatively, you could pass the names of all the separate files for each sample, separated by commas, to STAR's "--readFilesIn" option (e.g. --readFilesIn file1.fastq.gz,file2.fastq.gz,file3.fastq.gz). STAR will essentially concatenate them for you during alignment. Although this works, I recommend merging the files beforehand. If you do go with the comma-separated list, make sure the files are separated by commas and not spaces. The commas denote multiple files per sample, and a space is used to separate paired-end mates if you are analyzing paired-end data.
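If you would rather not type the comma-separated list by hand, a small helper can build it. This is only a sketch: join_by_comma and align_split_sample are hypothetical names, the genome directory and output prefix are placeholders you supply, and "--readFilesCommand zcat" is included on the assumption that your fastq files are gzipped.

```shell
#!/usr/bin/env bash
# Join all arguments with commas: a b c -> a,b,c
join_by_comma() {
    local IFS=,
    printf '%s\n' "$*"
}

# Hypothetical wrapper for single-end data:
#   align_split_sample <genome_dir> <output_prefix> <fastq chunks...>
align_split_sample() {
    local genome_dir="$1" prefix="$2"
    shift 2
    # For paired-end data, pass two comma-separated lists separated by a
    # space instead: one list for mate 1, then one list for mate 2, with
    # the chunks in the same order in both lists.
    STAR --genomeDir "$genome_dir" \
         --readFilesIn "$(join_by_comma "$@")" \
         --readFilesCommand zcat \
         --outFileNamePrefix "$prefix"
}
```

For example, align_split_sample star_index sample1_ sample1_file1.fastq.gz sample1_file2.fastq.gz sample1_file3.fastq.gz would run STAR with --readFilesIn sample1_file1.fastq.gz,sample1_file2.fastq.gz,sample1_file3.fastq.gz.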

Hope this helps! Erick

erilu / bulk-rnaseq-analysis

SRA runs are split #2