Hi, I need to generate a reference genome from the mouse refgen mm10 with 4 additional genes (4 fluorescent proteins). I already had a working refgen that included them, but it was built with a previous version of STAR, so now I have to regenerate it with STAR 2.7.11b.
Following instructions, I added the 4 genes at the end of gtf file, tab separated:
and added the respective sequences at the end of the fasta file (.fa), like:
and generated the refgen with:
The refgen is generated without errors, and so is the alignment, which I run (as always) with:
STAR --genomeDir=./STAR_index --readFilesIn=R2_001.fastq.gz, R2_001.fastq.gz --runThreadN=12 --soloType Droplet --soloCBwhitelist mylist.txt --soloUMIfiltering MultiGeneUMI --soloCBmatchWLtype 1MM_multi_pseudocounts --soloUMIlen 12 --sjdbGTFfile=genes_FLUO.gtf --readFilesCommand zcat
The problem arises when I load my features, barcodes and matrix to create an anndata object:
ValueError: Length of values (31057) does not match length of index (31053)
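The two numbers in the error differ by exactly 4 (31057 - 31053), the number of added genes, so one way to narrow this down is to compare the dimensions declared in the matrix header with the line counts of the features and barcodes files. The sketch below is only a diagnostic under assumptions: the helper name `solo_dims` is hypothetical, the three files are assumed to be uncompressed and to sit in one STARsolo output directory (for example `Solo.out/Gene/raw`, a path not given above), and the matrix is assumed to be in Matrix Market format with features as rows.

```python
from pathlib import Path


def count_lines(path):
    """Count the lines in a plain-text file."""
    with open(path) as fh:
        return sum(1 for _ in fh)


def solo_dims(solo_dir):
    """Compare matrix.mtx dimensions with features.tsv / barcodes.tsv line counts.

    Hypothetical helper: assumes uncompressed files named matrix.mtx,
    features.tsv and barcodes.tsv inside solo_dir.
    """
    solo_dir = Path(solo_dir)
    n_rows = n_cols = None
    with open(solo_dir / "matrix.mtx") as fh:
        for line in fh:
            # Matrix Market header and comment lines start with '%';
            # the first other line holds: rows, columns, non-zero entries.
            if not line.startswith("%"):
                n_rows, n_cols = (int(x) for x in line.split()[:2])
                break
    return {
        "matrix_rows": n_rows,
        "matrix_cols": n_cols,
        "features_lines": count_lines(solo_dir / "features.tsv"),
        "barcodes_lines": count_lines(solo_dir / "barcodes.tsv"),
    }


if __name__ == "__main__":
    # Assumed location; adjust to wherever the three files actually are.
    example_dir = Path("Solo.out/Gene/raw")
    if example_dir.is_dir():
        print(solo_dims(example_dir))
```

If `matrix_rows` and `features_lines` disagree, the features file being loaded probably does not belong to the same run or reference as the matrix.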
As if those 4 genes are not actually indexed, maybe? What am I doing wrong? Thank you so much for your help!