taxprofiler / taxpasta

TAXnomic Profile Aggregation and STAndardisation
https://taxpasta.readthedocs.io/
Apache License 2.0

Support Metaphlan4 format #107

Closed TheOafidian closed 1 year ago

TheOafidian commented 1 year ago

The number of rows preceding the profile table differs between MetaPhlAn3 and MetaPhlAn4, so taxpasta currently fails on MetaPhlAn4 output. Since the header row is formatted identically in both versions, a quick scan for the first field of the header can ensure the table is read from the right line in either format.

I did not find any existing issue for this, so I have simply added the code I used to get it working on my local example of the new format.
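
For readers of this thread, here is a minimal sketch of the header-scan idea described above. It is not the code from this pull request. It assumes the header's first field is `#clade_name`, and the names `HEADER_FIRST_FIELD`, `find_header_index`, and `read_profile` are hypothetical. The preamble lines in the two sample strings are invented for illustration and are not real MetaPhlAn output.

```python
import csv

# Assumption: the table header starts with this field in both versions.
HEADER_FIRST_FIELD = "#clade_name"


def find_header_index(lines):
    """Return the index of the first line whose first tab-separated field is the header field."""
    for index, line in enumerate(lines):
        if line.split("\t", 1)[0] == HEADER_FIRST_FIELD:
            return index
    raise ValueError(f"No header line starting with {HEADER_FIRST_FIELD!r} found.")


def read_profile(text):
    """Parse the table that begins at the header line, ignoring any preamble."""
    lines = text.splitlines()
    start = find_header_index(lines)
    # DictReader takes the first line it is given as the column names.
    return list(csv.DictReader(lines[start:], delimiter="\t"))


# Invented sample data: the same table behind preambles of different lengths.
TABLE = (
    "#clade_name\tNCBI_tax_id\trelative_abundance\tadditional_species\n"
    "k__Bacteria\t2\t100.0\t\n"
)
SHORT_PREAMBLE = "#database\n#command\n#SampleID\tMetaphlan_Analysis\n" + TABLE
LONG_PREAMBLE = (
    "#database\n#command\n#extra line\n#SampleID\tMetaphlan_Analysis\n" + TABLE
)

if __name__ == "__main__":
    for text in (SHORT_PREAMBLE, LONG_PREAMBLE):
        print(find_header_index(text.splitlines()), read_profile(text))
```

Because the header is located by content rather than by a fixed number of skipped rows, the same reader handles both preamble lengths. In a pandas-based reader, the index found this way could be passed as the number of rows to skip.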

jfy133 commented 1 year ago

Thanks @TheOafidian !

I'll leave it up to @Midnighter to decide whether he's OK with this approach; however, I can already predict he will ask for example test data from MetaPhlAn4 to be added. Would you be able to upload some too? (It can be the file where you found the issue, but you can manually remove any sample-identifying information.)

codecov-commenter commented 1 year ago

Codecov Report

Patch coverage: 84.33% and project coverage change: +0.17 :tada:

Comparison is base (f60efe6) 81.99% compared to head (06c1191) 82.17%.

:exclamation: Current head 06c1191 differs from pull request most recent head ef280dc. Consider uploading reports for the commit ef280dc to get more accurate results

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##              dev     #107      +/-   ##
==========================================
+ Coverage   81.99%   82.17%   +0.17%
==========================================
  Files         106      110       +4
  Lines        1594     1677      +83
  Branches      281      299      +18
==========================================
+ Hits         1307     1378      +71
- Misses        247      255       +8
- Partials       40       44       +4
```

| [Impacted Files](https://app.codecov.io/gh/taxprofiler/taxpasta/pull/107?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None) | Coverage Δ | |
|---|---|---|
| [.../application/metaphlan/metaphlan\_profile\_reader.py](https://app.codecov.io/gh/taxprofiler/taxpasta/pull/107?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None#diff-c3JjL3RheHBhc3RhL2luZnJhc3RydWN0dXJlL2FwcGxpY2F0aW9uL21ldGFwaGxhbi9tZXRhcGhsYW5fcHJvZmlsZV9yZWFkZXIucHk=) | `67.56% <50.00%> (-32.44%)` | :arrow_down: |
| [...ucture/application/application\_service\_registry.py](https://app.codecov.io/gh/taxprofiler/taxpasta/pull/107?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None#diff-c3JjL3RheHBhc3RhL2luZnJhc3RydWN0dXJlL2FwcGxpY2F0aW9uL2FwcGxpY2F0aW9uX3NlcnZpY2VfcmVnaXN0cnkucHk=) | `93.16% <83.33%> (+0.26%)` | :arrow_up: |
| [...rc/taxpasta/infrastructure/application/\_\_init\_\_.py](https://app.codecov.io/gh/taxprofiler/taxpasta/pull/107?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None#diff-c3JjL3RheHBhc3RhL2luZnJhc3RydWN0dXJlL2FwcGxpY2F0aW9uL19faW5pdF9fLnB5) | `100.00% <100.00%> (ø)` | |
| [...pasta/infrastructure/application/ganon/\_\_init\_\_.py](https://app.codecov.io/gh/taxprofiler/taxpasta/pull/107?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None#diff-c3JjL3RheHBhc3RhL2luZnJhc3RydWN0dXJlL2FwcGxpY2F0aW9uL2dhbm9uL19faW5pdF9fLnB5) | `100.00% <100.00%> (ø)` | |
| [.../infrastructure/application/ganon/ganon\_profile.py](https://app.codecov.io/gh/taxprofiler/taxpasta/pull/107?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None#diff-c3JjL3RheHBhc3RhL2luZnJhc3RydWN0dXJlL2FwcGxpY2F0aW9uL2dhbm9uL2dhbm9uX3Byb2ZpbGUucHk=) | `100.00% <100.00%> (ø)` | |
| [...tructure/application/ganon/ganon\_profile\_reader.py](https://app.codecov.io/gh/taxprofiler/taxpasta/pull/107?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None#diff-c3JjL3RheHBhc3RhL2luZnJhc3RydWN0dXJlL2FwcGxpY2F0aW9uL2dhbm9uL2dhbm9uX3Byb2ZpbGVfcmVhZGVyLnB5) | `100.00% <100.00%> (ø)` | |
| [...ion/ganon/ganon\_profile\_standardisation\_service.py](https://app.codecov.io/gh/taxprofiler/taxpasta/pull/107?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None#diff-c3JjL3RheHBhc3RhL2luZnJhc3RydWN0dXJlL2FwcGxpY2F0aW9uL2dhbm9uL2dhbm9uX3Byb2ZpbGVfc3RhbmRhcmRpc2F0aW9uX3NlcnZpY2UucHk=) | `100.00% <100.00%> (ø)` | |
| [...a/infrastructure/application/supported\_profiler.py](https://app.codecov.io/gh/taxprofiler/taxpasta/pull/107?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None#diff-c3JjL3RheHBhc3RhL2luZnJhc3RydWN0dXJlL2FwcGxpY2F0aW9uL3N1cHBvcnRlZF9wcm9maWxlci5weQ==) | `100.00% <100.00%> (ø)` | |

Midnighter commented 1 year ago

Just to be sure, the extra line is a default feature of MetaPhlAn4 and not due to some option that you have set?

TheOafidian commented 1 year ago

Hey @Midnighter, I don't have time to look for an example on my work laptop right now, but indeed I used the default arguments. The MetaPhlAn4 tutorial also shows example output where this new line is present, just as in my data.

jfy133 commented 1 year ago

Some test data thanks to @apcamargo

https://github.com/taxprofiler/taxpasta/issues/111#issuecomment-1606547819