NBISweden / AGAT

Another Gtf/Gff Analysis Toolkit
GNU General Public License v3.0

agat_sp_extract_sequences.pl Could not open index file but it exists #479

Open hans-vg opened 3 months ago

hans-vg commented 3 months ago

Describe the bug
When I run the command, it gets nearly to the end of the process but then fails with an error saying that it could not open the index file, even though the file exists. The input GFF file was produced by MAKER.

General (please complete the following information):

To Reproduce
agat_sp_extract_sequences.pl --gff highquality_set_aed03.gff --fasta JG_Nov2021_contig.fa -t cds -o highquality_set_aed03.CDS.fasta

Additional context
Log file:

agat_sp_extract_sequences.pl --gff highquality_set_aed03.gff --fasta JG_Nov2021_contig.fa -t cds -o highquality_set_aed03.CDS.fasta

 ------------------------------------------------------------------------------
|   Another GFF Analysis Toolkit (AGAT) - Version: v1.4.0                      |
|   https://github.com/NBISweden/AGAT                                          |
|   National Bioinformatics Infrastructure Sweden (NBIS) - www.nbis.se         |
 ------------------------------------------------------------------------------
=> Using standard /home/hvasquezgross/miniforge3/envs/agat/lib/perl5/site_perl/auto/share/dist/AGAT/agat_config.yaml config file
We will extract the cds sequences.
Reading file highquality_set_aed03.gff

                          ------ Start parsing ------                           
-------------------------- parse options and metadata --------------------------
=> Accessing the feature_levels YAML file
Using standard /home/hvasquezgross/miniforge3/envs/agat/lib/perl5/site_perl/auto/share/dist/AGAT/feature_levels.yaml file
=> Attribute used to group features when no Parent/ID relationship exists (i.e common tag):
    * locus_tag
    * gene_id
=> merge_loci option deactivated
=> Machine information:
    This script is being run by perl v5.32.1
    Bioperl location being used: /home/hvasquezgross/miniforge3/envs/agat/lib/perl5/site_perl/Bio/
    Operating system being used: linux 
=> Accessing Ontology
    No ontology accessible from the gff file header!
    We use the SOFA ontology distributed with AGAT:
        /home/hvasquezgross/miniforge3/envs/agat/lib/perl5/site_perl/auto/share/dist/AGAT/so.obo
    Read ontology /home/hvasquezgross/miniforge3/envs/agat/lib/perl5/site_perl/auto/share/dist/AGAT/so.obo:
        4 root terms, and 2596 total terms, and 1516 leaf terms
    Filtering ontology:
        We found 1861 terms that are sequence_feature or is_a child of it.
--------------------------------- parsing file ---------------------------------
=> Number of line in file: 1615817
=> Number of comment lines: 0
=> Fasta included: No
=> Number of features lines: 1615817
=> Number of feature type (3rd column): 6
    * Level1: 1 => gene
    * level2: 1 => mRNA
    * level3: 4 => exon three_prime_UTR five_prime_UTR CDS
    * unknown: 0 => 
=> Version of the Bioperl GFF parser selected by AGAT: 3
Parsing: 100% [======================================================]D 0h03m05s
                 ------ End parsing (done in 188 second) ------                 

                           ------ Start checks ------                           
---------------------------- Check1: feature types -----------------------------
----------------------------------- ontology -----------------------------------
All feature types in agreement with the Ontology.
------------------------------------- agat -------------------------------------
AGAT can deal with all the encountered feature types (3rd column)
------------------------------ done in 0 seconds -------------------------------

------------------------------ Check2: duplicates ------------------------------
None found
------------------------------ done in 0 seconds -------------------------------

-------------------------- Check3: sequential bucket ---------------------------
Nothing to check as sequential bucket!
------------------------------ done in 0 seconds -------------------------------

--------------------------- Check4: l2 linked to l3 ----------------------------
No problem found
------------------------------ done in 1 seconds -------------------------------

--------------------------- Check5: l1 linked to l2 ----------------------------
No problem found
------------------------------ done in 1 seconds -------------------------------

--------------------------- Check6: remove orphan l1 ---------------------------
We remove only those not supposed to be orphan
None found
------------------------------ done in 0 seconds -------------------------------

------------------------- Check7: all level3 locations -------------------------
------------------------------ done in 24 seconds ------------------------------

------------------------------ Check8: check cds -------------------------------
No problem found
------------------------------ done in 0 seconds -------------------------------

----------------------------- Check9: check exons ------------------------------
No exons created
No exons locations modified
No supernumerary exons removed
No level2 locations modified
------------------------------ done in 18 seconds ------------------------------

----------------------------- Check10: check utrs ------------------------------
No UTRs created
No UTRs locations modified
No supernumerary UTRs removed
------------------------------ done in 11 seconds ------------------------------

------------------------ Check11: all level2 locations -------------------------
No problem found
------------------------------ done in 16 seconds ------------------------------

------------------------ Check12: all level1 locations -------------------------
No problem found
------------------------------ done in 2 seconds -------------------------------

---------------------- Check13: remove identical isoforms ----------------------
None found
------------------------------ done in 0 seconds -------------------------------
                  ------ End checks (done in 73 second) ------                  

Parsing Finished

------------- EXCEPTION -------------
MSG: Could not open index file JG_Nov2021_contig.fa.index: No such file or directory
STACK Bio::DB::IndexedBase::_open_index /home/hvasquezgross/miniforge3/envs/agat/lib/perl5/site_perl/Bio/DB/IndexedBase.pm:678
STACK Bio::DB::IndexedBase::_index_files /home/hvasquezgross/miniforge3/envs/agat/lib/perl5/site_perl/Bio/DB/IndexedBase.pm:655
STACK Bio::DB::IndexedBase::index_file /home/hvasquezgross/miniforge3/envs/agat/lib/perl5/site_perl/Bio/DB/IndexedBase.pm:488
STACK Bio::DB::IndexedBase::new /home/hvasquezgross/miniforge3/envs/agat/lib/perl5/site_perl/Bio/DB/IndexedBase.pm:365
STACK toplevel /home/hvasquezgross/miniforge3/envs/agat/bin/agat_sp_extract_sequences.pl:158

File listing after run: ls -alh


drwxrwxr-x 2 hvasquezgross hvasquezgross   14 Aug 29 16:51 .
drwxrwxr-x 3 hvasquezgross hvasquezgross   18 Aug 29 15:23 ..
-rw-rw-r-- 1 hvasquezgross hvasquezgross 5.4K Aug 29 17:03 highquality_set_aed03.agat.log
-rw-rw-r-- 1 hvasquezgross hvasquezgross    0 Aug 29 16:58 highquality_set_aed03.CDS.fasta
-rw-r--r-- 1 hvasquezgross hvasquezgross 279M Aug 29 15:53 highquality_set_aed03.gff
-rw-rw-r-- 1 hvasquezgross hvasquezgross 3.1G Aug 29 16:00 JG_Nov2021_contig.fa
-rw-r--r-- 1 hvasquezgross hvasquezgross 336K Aug 29 16:05 JG_Nov2021_contig.fa.index

Juke34 commented 2 months ago

No idea, this is strange. Did you try removing the index and re-running?
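
For anyone following along, the suggestion can be sketched as a small shell helper. This is only an illustration, not a confirmed fix: the function name `remove_fasta_index` is hypothetical, and the `.index` suffix is taken from the error message in the log, where the index sits next to the FASTA file.

```shell
# Hypothetical helper (not part of AGAT): delete the index that sits next to
# a FASTA file so that the next run has to rebuild it from scratch.
# Assumption: the index is named "<fasta>.index", as in the error message.
remove_fasta_index() {
    fasta="$1"
    index="${fasta}.index"
    if [ -e "$index" ]; then
        rm -f -- "$index" && echo "removed $index"
    else
        echo "no index found at $index"
    fi
}

# Usage with the file names from the report:
#   remove_fasta_index JG_Nov2021_contig.fa
#   agat_sp_extract_sequences.pl --gff highquality_set_aed03.gff \
#       --fasta JG_Nov2021_contig.fa -t cds -o highquality_set_aed03.CDS.fasta
```

The FASTA file itself is left untouched; only the index is removed. The original command can then be run again unchanged.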