BaderLab / AutoAnnotateApp

The AutoAnnotate Cytoscape App finds clusters of nodes and visually annotates them with semantic labels and groups.
GNU Lesser General Public License v2.1

question? aggregated summary #183

Closed veroniquevoisin closed 1 year ago

veroniquevoisin commented 1 year ago

Just a question: what does Unique Values represent in AutoAnnotate: Create Summary Network for the field GS_DESCR? I'm assuming that it is the gene-set description of the most significant gene set in the cluster (by FDR), and that ties explain why there can be more than one value.

[Screenshot: Screen Shot 2022-12-02 at 1 52 55 PM]

mikekucera commented 1 year ago

Unique Values takes all the values for an attribute, removes the duplicates, then concatenates the results (using "," as delimiter).

For example, a set of nodes with values "a", "a", "a", "b", "b", "c" would become one string with the value "a,b,c".
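
To make the behaviour described above concrete, here is a minimal Python sketch of that kind of aggregation. It is not the app's actual code (AutoAnnotate is a Java app); the function name `unique_values` is hypothetical, and keeping values in first-seen order is an assumption, since the thread does not say whether the result is ordered or sorted.

```python
def unique_values(values, delimiter=","):
    """Drop duplicate values, then join what is left into one string.

    Hypothetical helper for illustration only; first-seen order is an
    assumption, not something confirmed in the thread.
    """
    # dict.fromkeys keeps only the first occurrence of each key, in order.
    unique = dict.fromkeys(str(v) for v in values)
    return delimiter.join(unique)


# The example from the comment above.
print(unique_values(["a", "a", "a", "b", "b", "c"]))  # a,b,c
```

With gene-set descriptions as input, every distinct description in the cluster ends up in the string, which is why the result can get long.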

That's why I added the new aggregator called Cluster Label. Instead of making a long label with all the gene-set names smashed together, it uses the cluster label from AA.

veroniquevoisin commented 1 year ago

Ok, Thanks! I was thinking that another field would be useful: label of most significant node (fdr) but then it makes AA not generic.

mikekucera commented 1 year ago

I was thinking that another field would be useful: label of most significant node (fdr) but then it makes AA not generic.

It's OK to have custom aggregators for EM networks. They just don't show up for non-EM networks.

This gets complicated because in a mastermap there are multiple FDR columns. Would it be OK to use the label of the node that has the most significant FDR across all datasets?

veroniquevoisin commented 1 year ago

yes, that would work!


mikekucera commented 1 year ago

What if more than one node has the same most significant FDR value? Should it return a concatenated list of the names, or just pick one name at random?

veroniquevoisin commented 1 year ago

The most accurate option would be a concatenated list of the names. We could test that first and see how often ties occur. That would be my suggestion.
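
The aggregator proposed in this thread could be sketched roughly as follows, in Python. This is an illustration of the idea only, not the app's implementation: the function name `most_significant_labels`, the column names `name`, `fdr_dataset1` and `fdr_dataset2`, and the representation of nodes as dictionaries are all hypothetical. It assumes that a lower FDR means more significant, takes each node's best FDR across all datasets, and concatenates the labels when there is a tie.

```python
def most_significant_labels(nodes, fdr_columns, label_column="name", delimiter=","):
    """Return (labels, tie_count) for the node(s) with the lowest FDR.

    nodes: list of dicts, one per node in the cluster (hypothetical layout).
    fdr_columns: names of the per-dataset FDR columns (hypothetical names).
    Tied nodes have their labels joined with the delimiter.
    """
    best_fdr = None
    best_labels = []
    for node in nodes:
        # A node may lack an FDR value for some datasets; skip missing ones.
        fdrs = [node[c] for c in fdr_columns if node.get(c) is not None]
        if not fdrs:
            continue
        node_fdr = min(fdrs)  # most significant FDR across all datasets
        if best_fdr is None or node_fdr < best_fdr:
            best_fdr = node_fdr
            best_labels = [node[label_column]]
        elif node_fdr == best_fdr:
            best_labels.append(node[label_column])  # tie: keep both labels
    return delimiter.join(best_labels), len(best_labels)


# Made-up values, purely for illustration.
cluster = [
    {"name": "DNA REPAIR", "fdr_dataset1": 0.001, "fdr_dataset2": 0.2},
    {"name": "CELL CYCLE", "fdr_dataset1": 0.03, "fdr_dataset2": 0.001},
    {"name": "APOPTOSIS", "fdr_dataset1": 0.04, "fdr_dataset2": None},
]
labels, tie_count = most_significant_labels(cluster, ["fdr_dataset1", "fdr_dataset2"])
print(labels, tie_count)  # DNA REPAIR,CELL CYCLE 2
```

Returning the tie count alongside the label makes it easy to check how often ties occur before deciding whether concatenation is the right behaviour.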
