TreeScaper / TreeScaper


CLVTreeScaper outputs always identified with "trees" even for bipartitions #4

Open tmcgowan opened 4 years ago

tmcgowan commented 4 years ago

Running -o Covariance, or -o Community -ft Cova -t covariance, produces matrices whose labels indicate they describe trees, but I believe they actually describe bipartitions.

From the community results file:

Community index (first column is tree index):

From the covariance matrix:

tree 1 2 3 ...

Since the visualization JS code only sees the output and not the command line arguments, this makes it very hard to correctly process CLVTreeScaper output.

tmcgowan commented 4 years ago

Attached are two CD outputs.

Community_Results-trees.txt was run with the boot trees as input:
-ft Trees -o Community -w 0 -r 0 -t Affinity -cm CPM -lm auto

Community_Results-bipartitions.txt was run with the boot trees covariance matrix as input:
-ft Cova -o Community -w 0 -r 0 -t Covariance -cm CPM -lm auto

Community_Results-bipartions.txt Community_Results-trees.txt

@jembrown @btoup15 @kagallivan @zhifeng1703 @jwilgenb

zhifeng1703 commented 4 years ago

@tmcgowan

I would like to make sure I understand your comments correctly. You are saying that the CD part of TreeScaper works fine, but the output file does not distinguish what the members of the communities are and assumes they are all trees.

If this is the case, I can modify the output format now. Would you prefer a flag in the filename or additional lines inside the file? If this is not what you mean, let me know what is needed here.
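To make the two options concrete from the consumer's side, here is a minimal Python sketch of how a reader of the output might detect the member type either way. The filename suffixes follow the two attachments named earlier in this thread. The "bipartition index" header wording is hypothetical, since the current output always says "tree index", and neither function is part of TreeScaper.

```python
import re
from typing import Optional

# Hypothetical header pattern: today the file always says "tree index";
# a corrected writer might say "bipartition index" for covariance-based runs.
HEADER_RE = re.compile(r"first column is (tree|bipartition) index", re.IGNORECASE)


def kind_from_filename(filename: str) -> Optional[str]:
    """Option 1: infer the member type from a filename suffix."""
    lowered = filename.lower()
    if lowered.endswith("-trees.txt"):
        return "tree"
    if lowered.endswith("-bipartitions.txt"):
        return "bipartition"
    return None  # unknown naming scheme


def kind_from_header(text: str) -> Optional[str]:
    """Option 2: infer the member type from a descriptive line inside the file."""
    match = HEADER_RE.search(text)
    return match.group(1).lower() if match else None
```

The in-file option survives renaming of the output file, while the filename option needs no change to the file contents; which trade-off matters more depends on how the visualization code receives the files.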

By the way, while browsing through these outputs, I noticed that they list every tree/bipartition with the label of the community it belongs to. In theory this contains all the information we need, but in practice it is inefficient and possibly inconvenient. For example, instead of outputting member by member, we could output each community with all of its members listed. If any such pre-processed data would help with processing and displaying results on your side, let us know and we can produce additional output files.
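As an illustration of the per-community layout suggested above, here is a minimal Python sketch that regroups member-by-member assignments into one entry per community. It assumes the assignments have already been parsed into (member index, community label) pairs; the function name and input shape are hypothetical and do not reflect the actual file format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_community(
    assignments: Iterable[Tuple[int, int]],
) -> Dict[int, List[int]]:
    """Turn (member, community) pairs into {community: [members...]}."""
    communities: Dict[int, List[int]] = defaultdict(list)
    for member, community in assignments:
        communities[community].append(member)
    # Sort communities and their members so the output is deterministic.
    return {label: sorted(members) for label, members in sorted(communities.items())}
```

For example, `group_by_community([(1, 0), (2, 1), (3, 0)])` puts members 1 and 3 under community 0 and member 2 under community 1.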

tmcgowan commented 4 years ago

@zhifeng1703

A couple of things:

  1. The opened issue is about the contents of the files. I have attached two output examples: one was generated from trees, the other from a covariance matrix. Both contain the text "Community index (first column is tree index):" in the file. This is confusing to the user and is not factually correct.

  2. Output naming across TreeScaper is very inconsistent. I think we have talked about naming in a call or two. I am thinking about a more general solution and will propose one shortly.

tmcgowan commented 4 years ago

See #6 as a possible solution.

jembrown commented 4 years ago

@zhifeng1703 It seems like this issue, regarding the incorrect labeling of output from bipartition-based analyses, is fixed now, right? If so, we can close this.

tmcgowan commented 4 years ago

@jembrown @zhifeng1703 the issue is against the master branch, so we really need to get the branch code merged.

jembrown commented 4 years ago

@tmcgowan Ah, ok. I was thinking this was fixed by the merge on July 31st, but it sounds like that's not the case? @zhifeng1703