vinisalazar / metaphor

Metaphor: a general-purpose workflow for assembly and binning of metagenomes
https://metaphor-workflow.readthedocs.io/

Weird taxa abundance plots #47

Closed camilogarciabotero closed 1 year ago

camilogarciabotero commented 1 year ago

Hey V,

So, I ran my data through Metaphor and it apparently finished fine. I then went to inspect the plots under annotation/cog/cobinning, where I found several abundance tables of this form:

genus   sscinames       CRR404407       CRR404408       CRR404409       CRR404410       CRR404411       CRR404412
        49884036.0      0.0092  0.0089  0.0105  0.0093  0.0088  0.0084
Acaryochloris   329726.0        0.0049  0.005   0.005   0.0049  0.0047  0.0058
Acetivibrio     203119.0        0.0004  0.0004  0.0003  0.0005  0.0005  0.0005
Acetoanaerobium 1511.0  0.0002  0.0002  0.0001  0.0002  0.0002  0.0001
Acetobacter     634452.0        0.0003  0.0004  0.0002  0.0002  0.0002  0.0002
Acetobacterium  931626.0        0.0002  0.0001  0.0002  0.0001  0.0001  0.0001
Acetohalobium   574087.0        0.0003  0.0004  0.0004  0.0003  0.0003  0.0003
Acetomicrobium  891968.0        0.0004  0.0005  0.0005  0.0005  0.0005  0.0004
Acholeplasma    441768.0        0.0     0.0     0.0     0.0     0.0     0.0
...

As you can see, the line after the header has a blank in the genus column. This might be affecting the plot, since it generates a single-column plot, or the problem might be with sscinames, which is being used instead and therefore pushes the values into what is actually an ID:

[image: screenshot of the resulting taxa abundance plot]

Any thoughts on this? By the way, this is happening at all levels of the taxa abundances. Best. Camilo.

vinisalazar commented 1 year ago

Hi @camilogarciabotero,

Thank you for reporting this. I believe this happens with some versions of Pandas, which read the first column of floats (49884036.0, 329726.0, ...) as abundances when they are actually integers representing the TaxID of each taxon in your sample.
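To illustrate the kind of handling described above, here is a minimal sketch and not the actual patch in Metaphor. It assumes a tab-separated table with the same layout as the excerpt in the report, casts the sscinames column to an integer dtype so it cannot be mistaken for an abundance, and keeps only the remaining columns as samples. The function name load_abundances and the "Unclassified" label for the blank genus are hypothetical choices made for the example.

```python
import io

import pandas as pd

# Small excerpt in the same layout as the table from the report (tab-separated).
TSV = (
    "genus\tsscinames\tCRR404407\tCRR404408\n"
    "\t49884036.0\t0.0092\t0.0089\n"
    "Acaryochloris\t329726.0\t0.0049\t0.005\n"
    "Acetivibrio\t203119.0\t0.0004\t0.0004\n"
)

# Columns that describe the taxon rather than a sample.
ID_COLUMNS = ("genus", "sscinames")


def load_abundances(handle):
    """Hypothetical helper: read the table and keep the ID column out of the abundances."""
    df = pd.read_csv(handle, sep="\t")
    # An empty genus field is parsed as a missing value; give it an explicit label
    # (the label itself is an assumption made for this sketch).
    df["genus"] = df["genus"].fillna("Unclassified")
    # The ID column is parsed as float64; store it as a nullable integer instead.
    df["sscinames"] = df["sscinames"].astype("Int64")
    return df


df = load_abundances(io.StringIO(TSV))
# Only these columns should be passed on to any plotting code.
sample_columns = [c for c in df.columns if c not in ID_COLUMNS]
print(df[["genus", "sscinames"]])
print(sample_columns)
```

Selecting the sample columns by name, rather than by "every numeric column", is what keeps the ID column out of the plot regardless of how a given Pandas version infers its dtype.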

I will push a patch soon to fix this. Then, if you update your install, delete your annotation/cog directory, and re-run, the pipeline should be able to resume from there (instead of running everything again).
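The cleanup step can be sketched as follows. This is only an illustration: the helper name clear_cog_outputs is hypothetical, and the location of the output directory is an assumption, since the thread only gives the relative path annotation/cog. The demonstration runs against a throwaway temporary directory, not real results.

```python
import shutil
import tempfile
from pathlib import Path


def clear_cog_outputs(output_dir):
    """Hypothetical helper: remove <output_dir>/annotation/cog; return True if it was removed."""
    cog_dir = Path(output_dir) / "annotation" / "cog"
    if cog_dir.is_dir():
        shutil.rmtree(cog_dir)
        return True
    return False


# Demonstration on a temporary directory standing in for the real output directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "annotation" / "cog").mkdir(parents=True)
    removed = clear_cog_outputs(tmp)
    still_there = (Path(tmp) / "annotation" / "cog").exists()

print(removed, still_there)
```

With the directory gone, re-running the workflow should regenerate only those outputs, as described above.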

Thank you for your patience! Vini

vinisalazar commented 1 year ago

Hi @camilogarciabotero,

I've released version 1.7.7, which should hopefully fix the problems you were experiencing. It might be a couple of days until the conda installation is updated.

Please let me know if you continue to have problems after that. Your feedback is invaluable.

Best, Vini

vinisalazar commented 1 year ago

@camilogarciabotero v1.7.7 is now live in Bioconda. If you could let me know if your plots look ok, that would be great.

Thanks

camilogarciabotero commented 1 year ago

Thanks V,

I am currently running it; I'll let you know how it goes. Thank you for solving the issue so quickly!

camilogarciabotero commented 1 year ago

Hey V,

I ran into an error involving the pandas library:

.miniconda3/envs/metaphor/lib/python3.11/site-packages/conda_package_handling/api.py:29: UserWarning: Install zstandard Python bindings for .conda support
  _warnings.warn("Install zstandard Python bindings for .conda support")
Could not solve for environment specs
The following package could not be installed
└─ pandas >=2  does not exist (perhaps a typo or a missing channel).

Any thoughts on it?

vinisalazar commented 1 year ago

Hm, I hadn't come across that one before.

I believe running conda install -n metaphor zstandard -c conda-forge (i.e. installing zstandard in your Metaphor environment) should fix that.

Please let me know how it goes.

Thank you, Vini

camilogarciabotero commented 1 year ago

It says it is already installed...

vinisalazar commented 1 year ago

Could you please try installing it in your base environment instead?

vinisalazar commented 1 year ago

@camilogarciabotero I am going to go ahead and close this issue as I believe it has been fixed as of v1.7.7, but please don't hesitate to reopen it if you continue to have problems.

Thank you.