Open LCFortier opened 2 weeks ago
At which point did it fail? Can you share the full log?
(genomad) forl1705@UN05FMSS508020 geNomad % genomad end-to-end --cleanup --splits 8 R20291_NC013316.fas R20291_genomad_output genomad_db
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Executing geNomad annotate (v1.8.0). This will perform gene calling in the input sequences and annotate the predicted proteins with geNomad's markers. │
│ ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── │
│ Outputs: │
│ R20291_genomad_output/R20291_NC013316_annotate │
│ ├── R20291_NC013316_annotate.json (execution parameters) │
│ ├── R20291_NC013316_genes.tsv (gene annotation data) │
│ ├── R20291_NC013316_taxonomy.tsv (taxonomic assignment) │
│ ├── R20291_NC013316_mmseqs2.tsv (MMseqs2 output file) │
│ └── R20291_NC013316_proteins.faa (protein FASTA file) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
[21:36:24] Executing genomad annotate.
[21:36:24] Creating the R20291_genomad_output/R20291_NC013316_annotate directory.
[21:36:33] Proteins predicted with pyrodigal-gv were written to R20291_NC013316_proteins.faa.
[21:38:37] Proteins annotated with MMseqs2 and geNomad database (v1.7) were written to R20291_NC013316_mmseqs2.tsv.
[21:38:37] Deleting R20291_NC013316_mmseqs2.
[21:38:38] Gene data was written to R20291_NC013316_genes.tsv.
[21:38:38] Taxonomic assignment data was written to R20291_NC013316_taxonomy.tsv.
[21:38:38] geNomad annotate finished!
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Executing geNomad find-proviruses (v1.8.0). This will find putative proviral regions within the input sequences. │
│ ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── │
│ Outputs: │
│ R20291_genomad_output/R20291_NC013316_find_proviruses │
│ ├── R20291_NC013316_find_proviruses.json (execution parameters) │
│ ├── R20291_NC013316_provirus.tsv (provirus data) │
│ ├── R20291_NC013316_provirus.fna (provirus nucleotide sequences) │
│ ├── R20291_NC013316_provirus_proteins.faa (provirus protein sequences) │
│ ├── R20291_NC013316_provirus_genes.tsv (provirus gene annotation data) │
│ ├── R20291_NC013316_provirus_taxonomy.tsv (provirus taxonomic assignment) │
│ ├── R20291_NC013316_provirus_mmseqs2.tsv (MMseqs2 output file) │
│ └── R20291_NC013316_provirus_aragorn.tsv (Aragorn output file) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
[21:38:38] Executing genomad find-proviruses.
[21:38:38] Creating the R20291_genomad_output/R20291_NC013316_find_proviruses directory.
[21:38:40] Integrases identified with MMseqs2 and geNomad database (v1.7) were written to R20291_NC013316_provirus_mmseqs2.tsv.
[21:38:40] Deleting R20291_NC013316_provirus_mmseqs2.
[21:38:40] Deleting R20291_NC013316_provirus_mmseqs2_input.faa.
[21:38:47] tRNAs identified with Aragorn were written to R20291_NC013316_provirus_aragorn.tsv.
[21:38:47] Deleting R20291_NC013316_provirus_aragorn_input.fna.
[21:38:48] Provirus regions identified.
[21:38:48] Provirus data was written to R20291_NC013316_provirus.tsv.
[21:38:48] Provirus nucleotide sequences were written to R20291_NC013316_provirus.fna.
[21:38:48] Provirus protein sequences were written to R20291_NC013316_provirus_proteins.faa.
[21:38:48] Provirus gene data was written to R20291_NC013316_provirus_genes.tsv.
[21:38:48] Taxonomic assignment data was written to R20291_NC013316_provirus_taxonomy.tsv.
[21:38:48] geNomad find-proviruses finished!
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Executing geNomad marker-classification (v1.8.0). This will classify the input sequences into chromosome, plasmid, or virus based on the presence of │
│ geNomad markers and other gene-related features. │
│ ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── │
│ Outputs: │
│ R20291_genomad_output/R20291_NC013316_marker_classification │
│ ├── R20291_NC013316_marker_classification.json (execution parameters) │
│ ├── R20291_NC013316_features.tsv (sequence feature data: tabular format) │
│ ├── R20291_NC013316_features.npz (sequence feature data: binary format) │
│ ├── R20291_NC013316_marker_classification.tsv (sequence classification: tabular format) │
│ ├── R20291_NC013316_marker_classification.npz (sequence classification: binary format) │
│ ├── R20291_NC013316_provirus_features.tsv (provirus feature data: tabular format) │
│ ├── R20291_NC013316_provirus_features.npz (provirus feature data: binary format) │
│ ├── R20291_NC013316_provirus_marker_classification.tsv (provirus classification: tabular format) │
│ └── R20291_NC013316_provirus_marker_classification.npz (provirus classification: binary format) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
[21:38:48] Executing genomad marker-classification.
[21:38:48] Creating the R20291_genomad_output/R20291_NC013316_marker_classification directory.
[21:38:49] Sequence features computed.
[21:38:49] Sequence features in binary format written to R20291_NC013316_features.npz.
[21:38:49] Sequence features in tabular format written to R20291_NC013316_features.tsv.
[21:38:49] Provirus features computed.
[21:38:49] Provirus features in binary format written to R20291_NC013316_provirus_features.npz.
[21:38:49] Provirus features in tabular format written to R20291_NC013316_provirus_features.tsv.
[21:38:49] Sequences classified.
[21:38:49] Sequence classification in binary format written to R20291_NC013316_marker_classification.npz.
[21:38:49] Sequence classification in tabular format written to R20291_NC013316_marker_classification.tsv.
[21:38:49] Proviruses classified.
[21:38:49] Provirus classification in binary format written to R20291_NC013316_provirus_marker_classification.npz.
[21:38:49] Provirus classification in tabular format written to R20291_NC013316_provirus_marker_classification.tsv.
[21:38:49] geNomad marker-classification finished!
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Executing geNomad nn-classification (v1.8.0). This will classify the input sequences into chromosome, plasmid, or virus based on the nucleotide sequence. │
│ ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── │
│ Outputs: │
│ R20291_genomad_output/R20291_NC013316_nn_classification │
│ ├── R20291_NC013316_nn_classification.json (execution parameters) │
│ ├── R20291_NC013316_encoded_sequences (directory containing encoded sequence data) │
│ ├── R20291_NC013316_nn_classification.tsv (contig classification: tabular format) │
│ ├── R20291_NC013316_nn_classification.npz (contig classification: binary format) │
│ ├── R20291_NC013316_encoded_proviruses (directory containing encoded sequence data) │
│ ├── R20291_NC013316_provirus_nn_classification.tsv (provirus classification: tabular format) │
│ └── R20291_NC013316_provirus_nn_classification.npz (provirus classification: binary format) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
[21:40:07] Executing genomad nn-classification.
[21:40:07] Creating the R20291_genomad_output/R20291_NC013316_nn_classification directory.
[21:40:07] Creating the R20291_genomad_output/R20291_NC013316_nn_classification/R20291_NC013316_encoded_sequences directory.
[21:40:08] Encoded sequence data written to R20291_NC013316_encoded_sequences.
[21:40:08] Creating the R20291_genomad_output/R20291_NC013316_nn_classification/R20291_NC013316_encoded_proviruses directory.
[21:40:08] Encoded provirus data written to R20291_NC013316_encoded_proviruses.
Classifying sequences ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0% | -:--:--libc++abi: terminating due to uncaught exception of type Xbyak::Error: x2APIC is not supported
zsh: abort genomad end-to-end --cleanup --splits 8 R20291_NC013316.fas genomad_db
(genomad) forl1705@UN05FMSS508020 geNomad %
It seems that this failed on the neural network step, which could be caused by an incompatibility between TensorFlow (the deep learning library geNomad uses) and the processor's architecture (see here). Do you know whether you are running this natively or through Rosetta?
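One way to answer the native-versus-Rosetta question is to ask the Python interpreter inside the activated genomad environment. The snippet below is a minimal sketch, not part of geNomad: `describe_runtime` is a hypothetical helper name, and it assumes that macOS exposes the `sysctl.proc_translated` key (1 when the process is translated by Rosetta 2, 0 when it is native on Apple Silicon, absent on Intel Macs) and that the `sysctl` child process inherits the architecture of the calling interpreter.

```python
import platform
import subprocess


def describe_runtime():
    """Return (machine, translated) for the running Python interpreter.

    machine    : architecture string, e.g. 'arm64' or 'x86_64'
    translated : True under Rosetta 2, False if native, None if unknown
    """
    machine = platform.machine()
    translated = None
    if platform.system() == "Darwin":
        try:
            # Assumption: this key reports 1 for translated processes and
            # 0 for native ones; it does not exist on Intel Macs.
            out = subprocess.run(
                ["sysctl", "-n", "sysctl.proc_translated"],
                capture_output=True,
                text=True,
                check=True,
            ).stdout.strip()
            translated = out == "1"
        except (OSError, subprocess.CalledProcessError):
            translated = None  # key missing or sysctl unavailable
    return machine, translated


if __name__ == "__main__":
    machine, translated = describe_runtime()
    print(f"interpreter architecture: {machine}")
    print(f"running under Rosetta 2: {translated}")
```

Run it with the environment active, e.g. `python check_arch.py` (the file name is arbitrary). An `x86_64` architecture on an M1 machine would suggest the environment itself is an Intel build running through Rosetta, even if the Conda installer is arm64.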
Looking at the link you sent, it is probably the same issue related to the M1 chip. I have installed miniforge3 with the arm64 architecture. I suppose I have to follow the same steps as indicated in the thread you cited.
It seems that Conda (or Mamba) creates native environments by default and that the conda-forge version of TensorFlow is not really working on ARM chips. You may try:
As a last resort, if you are in a rush to get these results, you can disable the neural network branch with the --disable-nn-classification parameter.
OK thanks for the suggestions, I am not in a rush so I'll try it tomorrow and will let you know if it worked. Thanks for your quick reply!
Hi,
I ran geNomad and it seemed to run smoothly until it aborted due to an uncaught exception (see below).
libc++abi: terminating due to uncaught exception of type Xbyak::Error: x2APIC is not supported
I installed geNomad with Conda on a MacBook Pro M1.
Any idea of what could have gone wrong?
Thanks!