alexdobin / STAR

RNA-seq aligner
MIT License
1.77k stars 497 forks

"--genomeTransformOutput" only worked on the first chromosome in the genome #2077

Open kuowenhsi opened 4 months ago

kuowenhsi commented 4 months ago

I used a VCF file to generate a consensus haploid reference genome, then mapped RNA-seq reads to that reference. In the output BAM file, only the 1st chromosome in the reference shows correctly transformed coordinates; all the other chromosomes show wrong coordinates. This blocks downstream analyses, including featureCounts. Has anyone else experienced something similar?

The script I used is attached below:

STAR --runThreadN 16 --runMode genomeGenerate --genomeDir /storage1/fs1/kolsen/Active/Wen/Clover_ref_genome/drTriRepe4_hap1_1.5_CONSENSUS_STAR_haploid \
--genomeFastaFiles /storage1/fs1/kolsen/Active/Wen/Clover_ref_genome/HiFi_HiC_LM_combined_v_1.5.fasta \
--sjdbGTFfile /storage1/fs1/kolsen/Active/Wen/Clover_ref_genome/Hap1_v1.5_exon.gtf \
--sjdbOverhang 149 \
--sjdbGTFfeatureExon exon \
--genomeSAindexNbases 13 \
--genomeTransformVCF /storage1/fs1/kolsen/Active/Wen/Clover_ref_genome/re_sequencing_data/adapt_trimmed/bowtie2_parents_total_v1.4_hardfilter.vcf \
--genomeTransformType Haploid

STAR --runThreadN 8 --genomeDir /storage1/fs1/kolsen/Active/Wen/Clover_ref_genome/drTriRepe4_hap1_1.5_CONSENSUS_STAR_haploid \
    --outFileNamePrefix /storage1/fs1/kolsen/Active/Wen/RNA_seq_OmniC/map_"${ACCESSIONS_S[$i]}"_CONSENSUS_2passBasic/map_"${ACCESSIONS_S[$i]}"_CONSENSUS_2passBasic \
    --outReadsUnmapped Fastx \
    --readFilesCommand zcat \
    --quantMode GeneCounts \
    --outSAMtype BAM SortedByCoordinate \
    --outSAMattributes NH HI AS nM ha \
    --twopassMode Basic \
    --outFilterScoreMinOverLread 0.1 \
    --outFilterMatchNminOverLread 0.1 \
    --genomeTransformOutput SAM SJ Quant \
    --readFilesIn \
    "${ACCESSIONS[$i]}"_L002_R1_001.fastq.gz \
    "${ACCESSIONS[$i]}"_L002_R2_001.fastq.gz

kuowenhsi commented 4 months ago

Here are some snapshots. They show the same sample mapped against the "consensus reference" (2nd track) and the "regular reference" (3rd track). [Images: jbrowse1, jbrowse2]
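
One rough way to put numbers on what the two tracks show is to compare, per chromosome, the start positions of the same reads in both alignments. The sketch below is only an illustration and not part of the original report. It assumes both BAM files have first been dumped to headerless SAM text (for example with `samtools view`). The function name `coord_shift_by_chrom` and the file names in the usage comment are hypothetical. Reads are matched by name and FLAG, and a multi-mapping read keeps only its last record from the first file, so the counts are approximate.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize, per chromosome, how far read start
# positions differ between two headerless SAM text files.
#   $1: SAM records from the consensus-genome run (transformed output)
#   $2: SAM records from the regular-reference run
# Output columns (tab-separated, sorted by chromosome):
#   chromosome, reads compared, reads whose start differs, largest difference
coord_shift_by_chrom() {
  awk -F'\t' '
    # First file: remember chromosome and start for each read name + FLAG.
    NR == FNR { key = $1 SUBSEP $2; chrom[key] = $3; pos[key] = $4; next }
    # Second file: compare reads that landed on the same chromosome.
    {
      key = $1 SUBSEP $2
      if ((key in pos) && chrom[key] == $3) {
        d = $4 - pos[key]
        if (d < 0) d = -d
        n[$3]++
        if (d > 0) moved[$3]++
        if (d > max[$3]) max[$3] = d
      }
    }
    END { for (c in n) printf "%s\t%d\t%d\t%d\n", c, n[c], moved[c], max[c] }
  ' "$1" "$2" | sort
}

# Example usage (hypothetical file names):
#   samtools view consensus.bam > consensus.sam
#   samtools view regular.bam   > regular.sam
#   coord_shift_by_chrom consensus.sam regular.sam
```

If the coordinate transformation were applied consistently, every chromosome would be expected to show similarly small differences. A chromosome whose reads are shifted by large amounts while the first chromosome is not would match the behavior described above.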