Error in data.table::setnames: 'old' is length 7 but 'new' is length 2

honzee / RNAseqCNV

R package for large-scale CNV analysis from RNA-seq

MIT License

Error in data.table::setnames: 'old' is length 7 but 'new' is length 2 #18

Open deb0612 opened 1 year ago

deb0612 commented 1 year ago

Dear sir, when I tried to run RNAseqCNV, either through the Shiny app or by command, it showed the error message below. My read counts were generated with featureCounts, using the command "featureCounts -T 10 -a GENCODE28.gtf -o read.count -p -B -C -f -t exon -g gene_id *.bam". Is the format of my GTF file wrong?

honzee commented 1 year ago

Hi,

thank you for trying out our package!

I think this might be caused by an incorrectly formatted count file. It is expected to have two columns (https://github.com/honzee/RNAseqCNV#213-read-count-file-), but I think yours has 7. Could you send the first few lines of your count file to validate this?

If this is the case, I would suggest reformatting the input count files.

Best, Jan
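A quick way to check the column count before sending the file is sketched below in Python. This is not part of RNAseqCNV; the helper name `column_count` and the sample lines are made up for illustration, and the sketch assumes a tab-separated file whose comment lines start with `#`, as featureCounts output does.

```python
import csv
import io


def column_count(lines):
    """Return the number of tab-separated fields in the first non-comment line."""
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines and the "# Program:featureCounts ..." comment line.
        if row and not row[0].startswith("#"):
            return len(row)
    return 0


# Hypothetical sample in the featureCounts layout (one BAM file counted).
sample = io.StringIO(
    "# Program:featureCounts (illustrative comment line)\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tsample.bam\n"
    "ENSG00000223972.5\t1\t11869\t12227\t+\t359\t0\n"
)
print(column_count(sample))  # prints 7
```

For a real file, pass an open file handle instead of the `io.StringIO` sample. A result of 7 rather than 2 would match the lengths reported in the `setnames` error.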

deb0612 commented 1 year ago

The count file looked like this:

Geneid  Chr  Start  End  Strand  Length  A2969_S226_L003.pb.hg38.bam
ENSG00000223972.5  1  11869  12227  +  359  0
ENSG00000223972.5  1  12613  12721  +  109  0
ENSG00000223972.5  1  13221  14409  +  1189  1
ENSG00000223972.5  1  12010  12057  +  48  0
ENSG00000223972.5  1  12179  12227  +  49  0
ENSG00000223972.5  1  12613  12697  +  85  0
ENSG00000223972.5  1  12975  13052  +  78  1
ENSG00000223972.5  1  13221  13374  +  154  0

deb0612 commented 1 year ago

I tried to reformat my count file; however, there is another problem.

honzee commented 1 year ago

Could you test it in the console using the wrapper function? We might get more information that way. Also, could you please send the header of one of the count files again? It seems they are not being read correctly.

Best, Jan

deb0612 commented 1 year ago

In console

header of count file

honzee commented 1 year ago

Looking again at the count data matrix, the Ensembl IDs are not correctly formatted. RNAseqCNV expects unique Ensembl gene IDs, but multiple rows have the same Ensembl ID.

I suggest collapsing all rows with the same Ensembl ID together.

Best, Jan
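One possible way to do the collapsing is sketched below in Python. It is not RNAseqCNV code: the helpers `collapse_counts` and `write_two_columns` are made up for illustration. The sketch assumes a tab-separated featureCounts file with a single BAM per file (so the count is the last column), integer counts, and a header row starting with `Geneid`. Whether RNAseqCNV wants a header line or a particular ID format is not settled in this thread, so the output should be checked against the README.

```python
import csv
import io


def collapse_counts(lines):
    """Sum the last column per gene ID; returns {gene_id: total} in first-seen order."""
    totals = {}
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines, '#' comment lines and the header row (assumed "Geneid").
        if not row or row[0].startswith("#") or row[0] == "Geneid":
            continue
        gene_id, count = row[0], int(row[-1])  # assumes integer counts
        totals[gene_id] = totals.get(gene_id, 0) + count
    return totals


def write_two_columns(totals, out):
    """Write 'gene_id<TAB>count' lines without a header."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for gene_id, total in totals.items():
        writer.writerow([gene_id, total])


# The first rows posted earlier in this thread, tab-separated.
sample = io.StringIO(
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tA2969_S226_L003.pb.hg38.bam\n"
    "ENSG00000223972.5\t1\t11869\t12227\t+\t359\t0\n"
    "ENSG00000223972.5\t1\t12613\t12721\t+\t109\t0\n"
    "ENSG00000223972.5\t1\t13221\t14409\t+\t1189\t1\n"
    "ENSG00000223972.5\t1\t12975\t13052\t+\t78\t1\n"
)
out = io.StringIO()
write_two_columns(collapse_counts(sample), out)
print(out.getvalue(), end="")  # prints: ENSG00000223972.5<TAB>2
```

The repeated IDs probably come from the `-f` option in the featureCounts command above, which counts at the feature (exon) level. Rerunning without `-f` should give one row per `gene_id`, although that output would still have seven columns and would need to be reduced to two.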