Error in pipeline - Githubissues

charlottewright commented 4 years ago

Hello,

I am running this pipeline on my own data, produced using direct cDNA sequencing. It runs fine until step 13, but then I get the error pasted below. The annotation I am using is in GTF format but is not from Ensembl, as it is for a non-model organism. The transcriptome was generated from the GTF file using gffread.

Any suggestions to get past this issue would be appreciated.

Thank you!

Activating conda environment: /Nanopore/Differential_isoform_analysis/pipeline-transcriptome-de-trials/pipeline-transcriptome-de/Workspaces/pipeline-transcripto$
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Warning message:
In .get_cds_IDX(mcols0$type, mcols0$phase) :
 The "phase" metadata column contains non-NA values for features of type
 stop_codon. This information was ignored.
'select()' returned 1:many mapping between keys and columns
Error in dmDSdata(counts = counts, samples = coldata) :
 mode(counts) %in% "numeric" is not TRUE
Calls: dmDSdata -> stopifnot
Execution halted
[Thu Aug 13 18:47:18 2020]
Error in rule de_analysis:
  jobid: 9
  output: de_analysis/results_dge.tsv, de_analysis/results_dge.pdf, de_analysis/results_dtu_gene.tsv, de_analysis/results_dtu_transcript.tsv, de_analysis/results_dtu_stageR.tsv, merged/all_counts_filtered.tsv, merged/all_gene_cou$
  conda-env: /rds-d4/project/cj107/rds-cj107-jiggins-rds/projects/eratoCortexMapping/Nanopore/Differential_isoform_analysis/pipeline-transcriptome-de-trials/pipeline-transcriptome-de/Workspaces/pipeline-transcriptome-de_phe/.snak$
  shell:
  /Nanopore/Differential_isoform_analysis/pipeline-transcriptome-de-trials/pipeline-transcriptome-de/scripts/de_analysis.R
    (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

charlottewright commented 4 years ago

I have since solved this error by commenting out the strip_version function in de_analysis.R, as suggested in this post: https://github.com/nanoporetech/pipeline-transcriptome-de/issues/3
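For anyone hitting the same error with a non-Ensembl annotation, here is a rough way to check whether version stripping is what breaks the ID matching. This is only a sketch under assumptions: the `strip_version` below is a hypothetical Python stand-in that drops everything after the first "." (I have not verified that this is exactly what the R function in de_analysis.R does), and the GTF lines and count-table IDs are toy data, not taken from this issue.

```python
import re


def strip_version(tx_id):
    # Hypothetical stand-in for the pipeline's strip_version:
    # assumes it drops everything after the first "." in the ID.
    return tx_id.split(".", 1)[0]


def gtf_transcript_ids(gtf_lines):
    # Collect every transcript_id "..." value found in GTF-style lines.
    ids = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        match = re.search(r'transcript_id "([^"]+)"', line)
        if match:
            ids.add(match.group(1))
    return ids


def match_report(count_ids, annot_ids):
    # Count how many count-table IDs are found in the annotation,
    # both as-is and after version stripping.
    raw = sum(1 for i in count_ids if i in annot_ids)
    stripped = sum(1 for i in count_ids if strip_version(i) in annot_ids)
    return {"total": len(count_ids), "matched_raw": raw, "matched_stripped": stripped}


# Toy annotation whose transcript IDs contain a "." that is not a version suffix.
gtf = [
    'chr1\tsrc\ttranscript\t1\t100\t.\t+\t.\tgene_id "g1"; transcript_id "g1.t1";',
    'chr1\tsrc\ttranscript\t200\t300\t.\t+\t.\tgene_id "g1"; transcript_id "g1.t2";',
]
report = match_report(["g1.t1", "g1.t2"], gtf_transcript_ids(gtf))
print(report)
```

If `matched_raw` is high and `matched_stripped` is close to zero on your real files, the IDs probably contain dots that are not version suffixes, which would be consistent with commenting out strip_version helping.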

dkoppstein commented 4 years ago

I also ran into this error and fixed it by commenting out strip_version. It would be good to commit this in the main repo.

C-Pauli commented 3 years ago

Hi, I've run into the same issue, and commenting out strip_version has not helped. I'm using an NCBI genome CDS file and annotation GFF. Neither the GFF nor the GTF seems to work, even after using grep -v to remove the genes it has specific problems with.

"Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Warning messages:
1: In .extract_transcripts_from_GRanges(tx_IDX, gr, mcols0$type, mcols0$ID, :
 some transcripts have no "transcript_id" attribute ==> their name ("tx_name" column in the TxDb object) was set to NA
2: In .extract_transcripts_from_GRanges(tx_IDX, gr, mcols0$type, mcols0$ID, :
 the transcript names ("tx_name" column in the TxDb object) imported from the "transcript_id" attribute are not unique
'select()' returned 1:many mapping between keys and columns
Error in dmDSdata(counts = counts, samples = coldata) :
 mode(counts) %in% "numeric" is not TRUE
Calls: dmDSdata -> stopifnot
Execution halted
[Thu Sep 9 01:51:02 2021]
Error in rule de_analysis:
  jobid: 0
  output: de_analysis/results_dge.tsv, de_analysis/results_dge.pdf, de_analysis/results_dtu_gene.tsv, de_analysis/results_dtu_transcript.tsv, de_analysis/results_dtu_stageR.tsv, merged/all_counts_filtered.tsv, merged/all_gene_counts.tsv
  shell:
  /epi2melabs/differential-expression/pipeline-transcriptome-de/scripts/de_analysis.R
    (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /epi2melabs/differential-expression/.snakemake/log/2021-09-09T015044.982826.snakemake.log"

GCF_900626175.1_cs10_genomic.gff.gz

GCF_900626175.1_cs10_cds_from_genomic.fna.gz
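The two warnings in the log above (transcripts with no "transcript_id" attribute, and transcript names that are not unique) can be listed directly from the annotation before running the pipeline. Below is a minimal sketch, assuming tab-separated GTF or GFF3 lines and that transcript-level features are typed `transcript` or `mRNA`. The function name `audit_transcript_ids` and the example records are made up for illustration. Whether cleaning up these records resolves the dmDSdata error is not established in this thread.

```python
import re
from collections import Counter

# Matches both GTF style (transcript_id "X") and GFF3 style (transcript_id=X).
TX_ID = re.compile(r'transcript_id[ =]"?([^";]+)"?')


def audit_transcript_ids(lines, feature_types=("transcript", "mRNA")):
    # Return (number of transcript features lacking transcript_id,
    #         sorted list of transcript_id values used more than once).
    missing = 0
    seen = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] not in feature_types:
            continue
        match = TX_ID.search(cols[8])
        if match:
            seen[match.group(1)] += 1
        else:
            missing += 1
    duplicates = sorted(t for t, n in seen.items() if n > 1)
    return missing, duplicates


# Made-up GFF3 records: two mRNAs share an ID, one has none, exons are ignored.
example = [
    "chr1\tRefSeq\tmRNA\t1\t100\t.\t+\t.\tID=rna-1;Parent=gene-1;transcript_id=XM_1.1",
    "chr1\tRefSeq\tmRNA\t1\t120\t.\t+\t.\tID=rna-2;Parent=gene-1;transcript_id=XM_1.1",
    "chr1\tRefSeq\tmRNA\t200\t300\t.\t+\t.\tID=rna-3;Parent=gene-2",
    "chr1\tRefSeq\texon\t1\t100\t.\t+\t.\tID=exon-1;Parent=rna-1",
]
missing, duplicates = audit_transcript_ids(example)
print(missing, duplicates)
```

On a real file, pass the decompressed lines, for example from `gzip.open(path, "rt")`.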

nanoporetech / pipeline-transcriptome-de

Error in pipeline #18