taxprofiler / taxpasta

TAXnomic Profile Aggregation and STAndardisation
https://taxpasta.readthedocs.io/
Apache License 2.0

[BUG] Inconsistent formatting behaviour between standardise and merge #72

Open jfy133 opened 1 year ago

jfy133 commented 1 year ago

Problem description

While writing the tutorial, I noticed that for standardise the output column header is named count, whereas in merge it is the file name.

I wonder if we should match the behaviour between the two, so that both merge and standardise use the same column header format.

However, I now wonder whether we could even collapse the two commands into one: simply have standardise, which can accept one or more profiles (if more than one is provided, all are merged automatically...?). But then someone may wish to merge profiles themselves later on, so maybe it is safer as it is.

Anything else?

For example, if I merge the standardise output of one tool with the merge output of another tool:

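| taxonomy_id | 2612_pe-ERR5766176-db_mOTU | 2612_se-ERR5766180-db_mOTU | count |
|---|---|---|---|
| 40518 | 20 | 2 | NA |
| 216816 | 1 | 0 | NA |
| 1680 | 6 | 1 | NA |
| 1262820 | 1 | 0 | NA |
| 74426 | 2 | 1 | NA |
| 1907654 | 1 | 0 | NA |
| 1852370 | 3 | 1 | NA |
| 39491 | 3 | 0 | NA |
| 33039 | 2 | 0 | NA |
| 39486 | 1 | 0 | NA |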

Where count came from a standardise run on Kraken output.
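To make the problem concrete, here is a rough sketch in plain Python of how such a table can arise. It is not taxpasta's implementation: the `outer_join` and `columns_of` helpers are hypothetical, the mOTU rows are taken from the table above, and the Kraken row (taxonomy ID 562, count 15) is made up purely for illustration.

```python
# Hypothetical sketch; not taxpasta code.
# Merged mOTU table: one column per profile, named after the file
# (first three rows of the table above).
motus_merged = {
    "40518": {"2612_pe-ERR5766176-db_mOTU": 20, "2612_se-ERR5766180-db_mOTU": 2},
    "216816": {"2612_pe-ERR5766176-db_mOTU": 1, "2612_se-ERR5766180-db_mOTU": 0},
    "1680": {"2612_pe-ERR5766176-db_mOTU": 6, "2612_se-ERR5766180-db_mOTU": 1},
}

# Standardised Kraken table: the single value column is called "count".
# The taxonomy ID and value are invented for this example.
kraken_standardised = {"562": {"count": 15}}


def columns_of(table):
    """Return the column names of a {taxonomy_id: {column: value}} table, in order."""
    cols = []
    for row in table.values():
        for name in row:
            if name not in cols:
                cols.append(name)
    return cols


def outer_join(left, right):
    """Join two tables on taxonomy ID, filling missing cells with None (NA)."""
    columns = columns_of(left) + columns_of(right)
    joined = {}
    for tax_id in list(left) + [t for t in right if t not in left]:
        row = {**left.get(tax_id, {}), **right.get(tax_id, {})}
        joined[tax_id] = {col: row.get(col) for col in columns}
    return joined


joined = outer_join(motus_merged, kraken_standardised)
header = ["taxonomy_id"] + columns_of(joined)
print(header)
for tax_id, row in joined.items():
    print(tax_id, *["NA" if v is None else v for v in row.values()])
```

The last column ends up labelled only `count`, so nothing in the combined table says which profile it came from, while the other columns carry their file names.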

Midnighter commented 1 year ago

When using merge, each column represents one profile/sample. I don't see how in a wide table you would do this? Certainly in the long (tidy) format, you have only three columns: taxonomy_id, count, and sample.
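As a small illustration of the two layouts, the following plain-Python sketch reshapes a wide table (one column per profile) into the long three-column form. The `wide_to_long` helper is hypothetical and not part of taxpasta; the two rows are taken from the table earlier in this thread.

```python
# Hypothetical sketch; not taxpasta code.
# Wide layout: one row per taxon, one column per profile/sample.
wide = [
    {"taxonomy_id": 40518, "2612_pe-ERR5766176-db_mOTU": 20, "2612_se-ERR5766180-db_mOTU": 2},
    {"taxonomy_id": 216816, "2612_pe-ERR5766176-db_mOTU": 1, "2612_se-ERR5766180-db_mOTU": 0},
]


def wide_to_long(rows, id_column="taxonomy_id"):
    """Turn each (taxon, sample) cell into its own row: taxonomy_id, count, sample."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the ID column is carried along, not unpivoted
            long_rows.append({id_column: row[id_column], "count": value, "sample": column})
    return long_rows


long_rows = wide_to_long(wide)
for row in long_rows:
    print(row["taxonomy_id"], row["count"], row["sample"])
```

In the long form the column is always called `count`, and the profile name moves into the `sample` column instead of the header.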

jfy133 commented 1 year ago

But in standardise

[image]

the output is wide by default, so this still works: you would just rename count to the file name?

And if it's long, it's still just a single extra column that holds one value, the sample name.
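A minimal sketch of what I mean, in plain Python. This is only an illustration of the proposed behaviour, not existing taxpasta code: the `label_single_profile` helper, the file name `sample_a.tsv`, and the example row are all hypothetical, and deriving the sample name from the file stem is an assumption.

```python
from pathlib import Path

# Hypothetical sketch of the proposed behaviour; not taxpasta code.
# A standardised single profile as it looks today: taxonomy_id plus "count".
# The row is invented for illustration.
standardised = [{"taxonomy_id": 562, "count": 15}]


def label_single_profile(rows, profile_path, wide=True):
    """Attach the sample name (assumed here to be the file stem) to one profile.

    wide=True  -> rename the "count" column to the sample name.
    wide=False -> keep "count" and add a "sample" column with the same name in every row.
    """
    sample = Path(profile_path).stem
    if wide:
        return [{"taxonomy_id": r["taxonomy_id"], sample: r["count"]} for r in rows]
    return [{**r, "sample": sample} for r in rows]


wide_rows = label_single_profile(standardised, "sample_a.tsv", wide=True)
long_rows = label_single_profile(standardised, "sample_a.tsv", wide=False)
print(wide_rows)
print(long_rows)
```

Either way, the output of standardise would then carry the sample name just like the output of merge does.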

Midnighter commented 1 year ago

Oh, that's how you mean. I guess, in this case wide/long are actually the same 😆 We could offer the wide/long option, though, and then change output accordingly. You think that would be better/more consistent?

jfy133 commented 1 year ago

I think so; see #66 for an example of the issue in the tutorial :)