etal / cnvkit

Copy number variant detection from targeted DNA sequencing
http://cnvkit.readthedocs.org

AssertionError import-rna #706

Open Argonvi opened 2 years ago

Argonvi commented 2 years ago

Dear CNVkit developers,

I am trying to get import-rna to work, but I am getting an error. My command was:

cnvkit.py import-rna --format rsem expression/*/*.genes.results --output cnvkit_summary.tsv --output-dir cnvkit_outdir --gene-resource ensembl-gene-info.hg38.tsv

This is the error:

Dropping 19706 / 39332 rarely expressed genes from input samples
Loading gene metadata
Loaded ensembl-gene-info.hg38.tsv with shape: (221323, 9)
Trimmed gene info table to shape: (63966, 9)
Aligning gene info to sample gene counts
Weighting genes with below-average read counts
Calculating normalized gene read depths
Traceback (most recent call last):
  File "/home/agonzalez/.conda/envs/cnvkit/bin/cnvkit.py", line 9, in <module>
    args.func(args)
  File "/home/agonzalez/.local/lib/python3.9/site-packages/cnvlib/commands.py", line 1546, in _cmd_import_rna
    all_data, cnrs = import_rna.do_import_rna(
  File "/home/agonzalez/.local/lib/python3.9/site-packages/cnvlib/import_rna.py", line 38, in do_import_rna
    gene_info, sample_counts, sample_data_log2 = rna.align_gene_info_to_samples(
  File "/home/agonzalez/.local/lib/python3.9/site-packages/cnvlib/rna.py", line 270, in align_gene_info_to_samples
    sample_depths_log2 = normalize_read_depths(sc.divide(gi['tx_length'],
  File "/home/agonzalez/.local/lib/python3.9/site-packages/cnvlib/rna.py", line 308, in normalize_read_depths
    assert sample_depths.values.sum() > 0
AssertionError
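
For reference, the last frame of the traceback asserts that the summed depth matrix is greater than zero. The sketch below is not CNVkit code; it only mirrors that one comparison on hypothetical toy arrays (the function name `depths_are_usable` is made up for illustration) to show that the check fails both when every value is zero and when any value is NaN, since NumPy's `sum` propagates NaN and `nan > 0` is false.

```python
import numpy as np


def depths_are_usable(sample_depths: np.ndarray) -> bool:
    """Mirror the comparison from the traceback: grand total must be > 0."""
    return bool(sample_depths.sum() > 0)


# Hypothetical toy matrices (rows = genes, columns = samples), not real data.
ok = np.array([[1.0, 0.0], [2.0, 3.0]])
all_zero = np.zeros((2, 2))
with_nan = np.array([[1.0, 2.0], [np.nan, 3.0]])

print(depths_are_usable(ok))
print(depths_are_usable(all_zero))
print(depths_are_usable(with_nan))
```

Which of the two conditions, if either, applies to the run above cannot be determined from the traceback alone.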

I am using 36 samples, quantified with RSEM v1.3.1. I also tried a different gene resource that I built with BioMart from the GTF that was used to quantify gene expression, but the result was the same. Additionally, I have created a correlations file from TCGA-COAD data that I downloaded from cBioPortal, but I have not used it yet because of this error.

Here I attach one of the quantification files: FER02.genes.results.txt

Please help me solve this issue. Thanks in advance, Arturo

tetedange13 commented 2 years ago

Hi @Argonvi,

This could be related to issue #499 (and possibly to #596, though I am less sure). My wild guess: try adding the --format rsem option to your command line. You can also try different gene resources, although I don't think this is the root cause of your issue.

Hope this helps. Kind regards, Felix.

Argonvi commented 2 years ago

Hi @tetedange13,

I have tried both of your recommendations, but the result stays the same.
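
Since the failure comes right after the "Aligning gene info to sample gene counts" step, one thing that may be worth checking is whether the gene IDs in the RSEM files actually match the IDs in the gene resource. The sketch below is a standalone diagnostic, not part of CNVkit. It assumes the standard RSEM `genes.results` columns (`gene_id`, `expected_count`) and assumes the gene resource has an ID column whose name you pass in; the helper names and the toy data are hypothetical. A version-suffix mismatch (for example `ID.3` versus `ID`) is only one possible cause, and nothing in this thread confirms it.

```python
import csv
import io


def read_column(handle, column):
    """Return all values of one named column from a tab-separated file."""
    return [row[column] for row in csv.DictReader(handle, delimiter="\t")]


def strip_version(gene_id):
    """Drop a trailing '.N' version suffix, if present."""
    return gene_id.split(".", 1)[0]


def id_overlap(sample_ids, resource_ids):
    """Count shared IDs as written, and again after stripping versions."""
    exact = len(set(sample_ids) & set(resource_ids))
    unversioned = len(
        {strip_version(g) for g in sample_ids}
        & {strip_version(g) for g in resource_ids}
    )
    return exact, unversioned


# Toy stand-ins for a genes.results file and a gene resource (made-up IDs).
toy_rsem = io.StringIO(
    "gene_id\ttranscript_id(s)\tlength\teffective_length\t"
    "expected_count\tTPM\tFPKM\n"
    "ENSGTOY0001.3\tTX1\t2000.00\t1800.00\t120.00\t5.10\t4.20\n"
    "ENSGTOY0002.1\tTX2\t1500.00\t1300.00\t0.00\t0.00\t0.00\n"
)
toy_resource = io.StringIO(
    "gene_id\tgene_name\n"
    "ENSGTOY0001\tGENE_A\n"
    "ENSGTOY0002\tGENE_B\n"
)

rsem_rows = list(csv.DictReader(toy_rsem, delimiter="\t"))
sample_ids = [row["gene_id"] for row in rsem_rows]
total_count = sum(float(row["expected_count"]) for row in rsem_rows)
resource_ids = read_column(toy_resource, "gene_id")

exact, unversioned = id_overlap(sample_ids, resource_ids)
print(f"total expected_count: {total_count}")
print(f"shared IDs as written: {exact}, after stripping versions: {unversioned}")
```

To run this on real data, replace the `io.StringIO` objects with `open(path, newline="")` handles for one `*.genes.results` file and for the gene resource, and pass the resource's actual ID column name to `read_column`. If the "as written" overlap is near zero while the unversioned overlap is large, the two files likely use different ID conventions.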