taxprofiler / taxpasta

TAXnomic Profile Aggregation and STAndardisation
https://taxpasta.readthedocs.io/
Apache License 2.0

Variability in KrakenUniq output causes bugs #57

Closed. jfy133 closed this issue 1 year ago.

jfy133 commented 1 year ago

I tried running taxpasta 1.1 with a command that included the files in the attached zip, and it exited with an error.

@sofstam suspects this is because the contents of the CI test file are structured slightly differently.
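
One way to check that suspicion is to compare where the column header sits in each report. The following is a minimal sketch, not part of taxpasta: `describe_preamble` and `describe_file` are hypothetical helpers, the expected column names are taken from the failure cases in the log below, and it assumes that preamble lines start with `#` and that the report is tab-separated.

```python
from pathlib import Path

# Column names as they appear among the failure cases in the log of this issue.
EXPECTED_HEADER = [
    "%", "reads", "taxReads", "kmers", "dup", "cov", "taxID", "rank", "taxName",
]


def describe_preamble(lines):
    """Return (index of the header line, number of '#' lines seen before it).

    The index is None when no line matches EXPECTED_HEADER exactly.
    """
    n_comments = 0
    for index, line in enumerate(lines):
        # Assumption: fields are tab-separated.
        if line.rstrip("\n").split("\t") == EXPECTED_HEADER:
            return index, n_comments
        # Assumption: preamble lines are marked with a leading '#'.
        if line.startswith("#"):
            n_comments += 1
    return None, n_comments


def describe_file(path):
    """Apply describe_preamble to a report file on disk."""
    with Path(path).open() as handle:
        return describe_preamble(handle)
```

Calling `describe_file` on each of the six reports and comparing the results would show whether the failing file has a different number of lines before the header than the others.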

Command executed:

  taxpasta merge \
      -p krakenuniq -o krakenuniq_krakenuniq-db.tsv \
       \
       \
      MOCK_001_Illumina_Hiseq_3000_1.krakenuniq.report.txt MOCK_002_Illumina_Hiseq_3000_1.krakenuniq.report.txt MOCK_003_Illumina_Hiseq_3000.krakenuniq.report.txt MOCK_001_Minion_R9_1.krakenuniq.report.txt MOCK_002_Minion_R9_1.krakenuniq.report.txt MOCK_003_Minion_R9_1.krakenuniq.report.txt

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_TAXPROFILER:TAXPROFILER:STANDARDISATION_PROFILES:TAXPASTA_MERGE":
      taxpasta: $(taxpasta --version)
  END_VERSIONS

Command exit status:
  1

Command output:
  [17:26:39] CRITICAL Error in sample                                 merge.py:331
                      'MOCK_001_Illumina_Hiseq_3000_1.krakenuniq.repo             
                      rt' with profile                                            
                      'MOCK_001_Illumina_Hiseq_3000_1.krakenuniq.repo             
                      rt.txt'.                                                    
             CRITICAL      schema_context column  ...  failure_case   merge.py:334
                      index                                                       
                      0   DataFrameSchema   None  ...         3.418               
                      None                                                        
                      1   DataFrameSchema   None  ...         15525               
                      None                                                        
                      16  DataFrameSchema   None  ...          rank               
                      None                                                        
                      15  DataFrameSchema   None  ...         taxID               
                      None                                                        
                      14  DataFrameSchema   None  ...           cov               
                      None                                                        
                      13  DataFrameSchema   None  ...           dup               
                      None                                                        
                      12  DataFrameSchema   None  ...         kmers               
                      None                                                        
                      11  DataFrameSchema   None  ...      taxReads               
                      None                                                        
                      10  DataFrameSchema   None  ...         reads               
                      None                                                        
                      9   DataFrameSchema   None  ...             %               
                      None                                                        
                      8   DataFrameSchema   None  ...  unclassified               
                      None                                                        
                      7   DataFrameSchema   None  ...       no rank               
                      None                                                        
                      6   DataFrameSchema   None  ...             0               
                      None                                                        
                      5   DataFrameSchema   None  ...            NA               
                      None                                                        
                      4   DataFrameSchema   None  ...          1.17               
                      None                                                        
                      3   DataFrameSchema   None  ...       5226988               
                      None                                                        
                      2   DataFrameSchema   None  ...       15525.1               
                      None                                                        
                      17  DataFrameSchema   None  ...       taxName               
                      None                                                        

                      [18 rows x 6 columns]                                       
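
The failure cases above list both the usual column names (`%`, `reads`, ..., `taxName`) and what look like the values of a first data row (`3.418`, `15525`, `15525.1`, ...). The `15525` / `15525.1` pair is the pattern pandas produces when it de-duplicates repeated column names, so one possible reading is that a data row was parsed as the header. The sketch below only reproduces that symptom. The two-line report is reconstructed from the failure cases rather than copied from the attached files, and the off-by-one `skiprows` is an assumption about the cause, not a confirmed description of how taxpasta reads the file.

```python
import io

import pandas as pd

# Hypothetical minimal report: a header plus one data row whose values are
# reconstructed from the failure cases in the log (not the real file).
REPORT = (
    "%\treads\ttaxReads\tkmers\tdup\tcov\ttaxID\trank\ttaxName\n"
    "3.418\t15525\t15525\t5226988\t1.17\tNA\t0\tno rank\tunclassified\n"
)

# Reading with the header where it actually is gives the expected columns.
correct = pd.read_csv(io.StringIO(REPORT), sep="\t")

# Assumption: one line too many is skipped, so the data row becomes the header.
# pandas then renames the second "15525" to "15525.1".
shifted = pd.read_csv(io.StringIO(REPORT), sep="\t", skiprows=1)

print(list(correct.columns))
print(list(shifted.columns))
```

Under this reading, a strict schema check would report the real column names as missing and the data-derived names as unexpected, which is consistent with the 18 failure cases in the log.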

taxpasta_ku_error.zip