honzee / RNAseqCNV

R package for large-scale CNV analysis from RNA-seq
MIT License

counts matrix should be numeric, currently it has mode: logical #13

Closed xiucz closed 2 years ago

xiucz commented 2 years ago

Hi,

Thank you for your tool. I would like to test it: is it enough to use 7 samples? I ran into the error below and am not sure how to debug it. I would appreciate your advice, thanks a lot.

> RNAseqCNV_wrapper(config = "1366-2.cfg", metadata = "1366-2.metadata", snv_format = "vcf", genome_version = "hg19", batch =  FALSE, standard_samples =c("1366-2"))

Error in DESeqDataSet(se, design = design, ignoreRank) :
  counts matrix should be numeric, currently it has mode: logical

And one of my sample counts file:

$ head  count_files/1366-2.count
ENSG00000000003.15_6    1.00
ENSG00000000005.6_5     2.00
ENSG00000000419.13_7    600.00
ENSG00000000457.14_9    422.24
ENSG00000000460.17_8    303.76
ENSG00000000938.13_9    946.00
ENSG00000000971.16_5    24.00
ENSG00000001036.14_7    207.00
ENSG00000001084.13_13   222.00

Best, xiucz

honzee commented 2 years ago

Hi,

thank you for trying out RNAseqCNV! I hope we can get it working for you quickly.

I think the problem lies in the input count files. Please check the example in the README (https://github.com/honzee/RNAseqCNV#count_files). The expected format is an Ensembl gene ID and the read count, separated by a tab. In the example you provided, the gene identifiers have a different form and the read counts contain decimal points, which is unexpected.
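As a quick sanity check of the format described above, a small shell sketch along these lines could flag suspicious lines before running the package. The function name `check_count_file` is hypothetical, and the rules it applies (an unversioned `ENSG...` ID, one tab, a whole-number count) are assumptions taken from this thread rather than from the RNAseqCNV source, so the README remains the authority.

```shell
# Hypothetical helper: print every line that does not look like
#   <unversioned Ensembl gene ID><TAB><integer count>
# and return a non-zero status if any such line is found.
# The expected format is an assumption based on this thread.
check_count_file() {
    awk -F '\t' '
        NF != 2 || $1 !~ /^ENSG[0-9]+$/ || $2 !~ /^[0-9]+$/ {
            printf "line %d: %s\n", NR, $0
            bad++
        }
        END { exit (bad > 0) }
    ' "$1"
}
```

Calling `check_count_file count_files/1366-2.count` on the file shown earlier would list every line, since each one carries a version suffix and a decimal count. HTSeq summary lines such as `__no_feature` would be flagged too, which is harmless for this purpose.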

Could you perhaps reformat the count files in a way that would fit RNAseqCNV?

Best, Jan

xiucz commented 2 years ago

Actually, I prepared the count file from the RSEM result:

$ cut -f 1,5 $rsemcount |sed '1d' >$outdir/count_files/$name.count

Should I regenerate the count files with HTSeq instead? I also have two more questions:

  1. Is it suitable to use the results from RSEM, so that I do not have to change my quantification software?
  2. The first column of the example count file is an Ensembl gene ID without a version (e.g. ENSG00000000003), but my file has versioned IDs (e.g. ENSG00000000003.15_6). Should I remove the .15_6 suffix?

Best, xiucz

honzee commented 2 years ago

I think that RSEM should not pose a problem, perhaps with slight adjustments.

  1. Yes. I think the results from HTSeq and RSEM should be reasonably comparable. Just make sure that you are using the RSEM read counts.
  2. Yes, please remove the suffix - otherwise, the inbuilt reference will not work.
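For reference, the two adjustments above could be sketched roughly as follows. This assumes the standard RSEM `genes.results` layout, where column 1 is `gene_id` and column 5 is `expected_count` (the same columns selected by the `cut` command earlier in the thread). The function name `rsem_to_counts` is hypothetical, and rounding the expected counts to whole numbers is an assumption on my part, prompted by the remark that decimal counts are unexpected.

```shell
# Hypothetical helper: read an RSEM genes.results table on stdin and
# write "<gene ID without version><TAB><rounded count>" to stdout.
rsem_to_counts() {
    awk -F '\t' '
        NR == 1 { next }                       # skip the RSEM header line
        {
            id = $1
            sub(/\..*$/, "", id)               # ENSG00000000003.15_6 -> ENSG00000000003
            printf "%s\t%d\n", id, $5 + 0.5    # assumption: round expected_count to an integer
        }
    '
}
```

With the variables from the earlier command, this would be used as `rsem_to_counts < "$rsemcount" > "$outdir/count_files/$name.count"`. Note that stripping versions may leave duplicate IDs in some annotations (for example GENCODE `_PAR_Y` entries); the sketch does not deduplicate them.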

If you encounter some other issues, please let me know, and I am sure we will solve them.

Best, Jan

xiucz commented 2 years ago

@honzee Thank you!

It works well now! I will close this issue since my problem is solved, and I will report back if I run into other issues.

Best, xiucz