Open ycaspi257 opened 1 week ago
What version of AutoAnnotate are you using?
Can you please send me your framework-cytoscape.log file, found in the <user-home>/CytoscapeConfiguration/3 folder? That should contain the entire exception trace. And if possible, please send me your session file.
Thanks!
Dear Mike,
Thank you very much for your prompt reply.
The files you requested are attached.
I am using Autoannotate V.1.4.1 with Cytoscape 3.10.2 Java 10.0.12 on Ubuntu 20.04.
You can see the problem, e.g., in the network "Left_Hemisphere_fMRI_NQ-EF". The command I was using is:
autoannotate annotate-clusterBoosted clusterAlgorithm=MCL labelColumn=EnrichmentMap::GS_DESCR maxWords=3 network=current
Looking forward to your further help.
Best, Yaron Caspi
BTW, it was very hard, or even impossible, to find in the documentation the appropriate value for the clusterAlgorithm to put in the command instead of MCL
— Reply to this email directly, view it on GitHubhttps://github.com/BaderLab/AutoAnnotateApp/issues/207#issuecomment-2332129031, or unsubscribehttps://github.com/notifications/unsubscribe-auth/BLBDGVAI7KSWIXOFDKN3UWLZVB6Z3AVCNFSM6AAAAABNVUGW5OVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMZSGEZDSMBTGE. You are receiving this because you authored the thread.Message ID: @.***>
Hi, it looks like GitHub didn't attach your files. Can you please send them to me directly at mikekucera@gmail.com? Thanks.
Hi, there are two things that should help here...
1) Try updating AutoAnnotate to the latest version (currently 1.5.1). I don't get the same error with the latest version.
2) You must use a numeric column for the edgeWeightColumn attribute. Using the 'name' column, which has type String, causes an error in clusterMaker. Try edgeWeightColumn=EnrichmentMap::similarity_coefficient
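For reference, a minimal shell sketch that assembles the corrected command with the numeric column suggested above. The function name build_annotate_cmd is made up for illustration; the command arguments are the ones quoted in this thread. It only prints the command text, which you can then paste wherever you normally run Cytoscape commands.

```shell
#!/bin/sh
# Sketch: print the annotate command with a numeric edge-weight column.
# build_annotate_cmd is a hypothetical helper name, not part of AutoAnnotate.
# $1 (optional): edge-weight column; defaults to the column suggested above.
build_annotate_cmd() {
    weight_col="${1:-EnrichmentMap::similarity_coefficient}"
    printf '%s %s %s %s %s %s\n' \
        "autoannotate annotate-clusterBoosted" \
        "clusterAlgorithm=MCL" \
        "labelColumn=EnrichmentMap::GS_DESCR" \
        "maxWords=3" \
        "network=current" \
        "edgeWeightColumn=${weight_col}"
}

# Usage (prints a single line):
# build_annotate_cmd
```

Passing a different column name as the first argument swaps the edge-weight column, which makes it easy to retry with another numeric column.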
Dear Mike,
Thank you so much. After updating to version 1.5.1, it indeed seems to work.
Two more unrelated questions.
A. Is there a simple command to get the list of clusters and the number of nodes each includes (like the menu item used to export clusters to a file)?
B. Is there a way to add words to the "excluded words" list permanently? I mean, is there a file or something similar that I can edit to add several words for good?
Best, Yaron
Hi Yaron, I know you are running commands, but are you running them through R or Python?
If you are running commands through R or Python, then with regard to your first question: there isn't a simple command to get that information. What I usually do, after auto-annotating the network, is get the node table (I use RCy3 from R and its getTableColumns function):
default_node_table <- getTableColumns(table = "node", network = network_suid)
With that table you can use the column __mclCluster to get the number of nodes in each cluster and their names.
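As a rough illustration of the counting step, here is a shell sketch that tallies nodes per cluster. It assumes the node table has already been written out as a tab-separated file with an unquoted header row (for example from R), and that the cluster column is named __mclCluster as above. The function name cluster_sizes and the file layout are assumptions, not something AutoAnnotate provides.

```shell
#!/bin/sh
# Sketch: count nodes per cluster in a tab-separated node table.
# cluster_sizes is a hypothetical helper name.
# $1: path to the TSV file (header row required)
# $2 (optional): cluster column name, default __mclCluster
cluster_sizes() {
    awk -F '\t' -v col="${2:-__mclCluster}" '
        NR == 1 {
            # Locate the cluster column by its header name.
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
            next
        }
        # Skip nodes with no cluster value (empty or NA).
        $idx != "" && $idx != "NA" { count[$idx]++ }
        END { for (c in count) print c "\t" count[c] }
    ' "$1" | sort -n
}

# Usage: cluster_sizes node_table.tsv
```

Each output line is a cluster id and its node count, separated by a tab and sorted by cluster id.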
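For your second question, embedded in one of my R workflows I have:
library(RCy3)
# Words to exclude from the WordCloud labels: "pid" and the numbers 1 to 10.
words2ignore <- c("pid", 1:10)
# Send one "wordcloud ignore add" command per word for the given network.
responses <- lapply(words2ignore, function(x) {
  wordcloud2_url <- paste("wordcloud ignore add value=\"", x, "\" ", "network=SUID:", network_suid, sep = "")
  commandsGET(wordcloud2_url)
})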
Thanks, Ruth
Dear Ruth,
Thank you so much.
I use R.
When doing it manually (at least for AutoAnnotate), I did not find a mechanism to get the excluded words stored. This is why I thought that there might be an excluded-words file somewhere that I can just edit.
I was mainly interested in adding excluded words to the AutoAnnotate clustering algorithm and not to WordCloud (to get the cluster labeling to fit my purposes).
Thanks again.
Best, Yaron
Hi Yaron,
Autoannotate uses wordcloud to compute the labels so if you want to exclude words you have to make the change in word cloud.
There is a file in the WordCloud jar (which you can find in your CytoscapeConfiguration/3/apps/installed directory) called FlaggedWords.txt that you can add words to.
You would need to run the following commands to do it. (This is very hacky, sorry)
mv WordCloud-v3.1.4.jar WordCloud-v3.1.4.zip
Create a FlaggedWords.txt file which looks like this: kegg reactome react biocarta go nci msigdb my_new_word1 my_new_word2
And then run:
zip -u WordCloud-v3.1.4.zip FlaggedWords.txt
mv WordCloud-v3.1.4.zip WordCloud-v3.1.4.jar
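The steps above could be wrapped in a small shell function along these lines. This is only a sketch: the function name repack_wordcloud_jar is made up, the jar version is the one quoted above, and it assumes (as the zip command above implies) that FlaggedWords.txt sits at the top level of the jar. It also keeps a copy of the original jar next to your words file, which is an addition of this sketch, and it is probably safest to run it with Cytoscape closed.

```shell
#!/bin/sh
# Sketch: replace FlaggedWords.txt inside the WordCloud jar.
# repack_wordcloud_jar is a hypothetical helper name.
# $1: directory holding the jar (e.g. CytoscapeConfiguration/3/apps/installed)
# $2: path to your edited file, which must be named FlaggedWords.txt
repack_wordcloud_jar() {
    app_dir="$1"
    words_file="$2"
    jar="WordCloud-v3.1.4.jar"
    zipname="WordCloud-v3.1.4.zip"

    [ -f "$app_dir/$jar" ] || { echo "jar not found: $app_dir/$jar" >&2; return 1; }
    [ "$(basename "$words_file")" = "FlaggedWords.txt" ] && [ -f "$words_file" ] \
        || { echo "need an existing file named FlaggedWords.txt" >&2; return 1; }

    # Resolve absolute paths so the cd below does not break them.
    app_dir="$(cd "$app_dir" && pwd)" || return 1
    words_dir="$(cd "$(dirname "$words_file")" && pwd)" || return 1

    # Keep a copy of the untouched jar next to the words file.
    cp "$app_dir/$jar" "$words_dir/$jar.orig" || return 1

    mv "$app_dir/$jar" "$app_dir/$zipname" || return 1
    # Run zip from the words directory so the entry is stored as "FlaggedWords.txt".
    ( cd "$words_dir" && zip -u "$app_dir/$zipname" FlaggedWords.txt )
    status=$?
    # Always rename back, even if zip failed, so the app is not left as a .zip.
    mv "$app_dir/$zipname" "$app_dir/$jar" || return 1
    return "$status"
}

# Usage:
# repack_wordcloud_jar ~/CytoscapeConfiguration/3/apps/installed ./FlaggedWords.txt
```

The function returns non-zero if the jar or the words file is missing, or if zip reports an error.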
Alternatively, depending on the words, you can ask @mikekucera to add the words to the distribution, but words are often very specific to the dataset or data sources you are using, so we try to avoid that.
Thanks, Ruth
Dear Ruth,
Thanks again. I will follow these instructions.
I was mainly referring to the Gene Ontology collection prefixes in the pathway names, namely GOCC, GOMF and GOBP. When working with GSEA, GSEA adds these to the node names. Hence, when clustering, there is a bias toward these words in the cluster name.
It might be reasonable to exclude these words (or give an option to exclude these and similar words that GSEA adds) in future distributions, since they are relatively general and not specific.
Best, Yaron
Hi Yaron, Which geneset files are you using? Are you using the one supplied by GSEA? (word cloud weights the words based on occurrence in the network so if GOBP and GOMF are everywhere they shouldn't be coming up in the cluster tag). I don't see them coming up in my networks but I use the baderlab genesets and not the ones supplied with GSEA so I am curious if there is an issue. Thanks, Ruth
There is no global list of excluded words you can edit. The only way to do it is to modify the default list of words stored in the app jar, like Ruth suggested. Excluded words are saved in the session file and can only be set on a per-network basis. If you are using R, then the easiest thing to do is to have a series of commands of the form
wordcloud ignore add value="wordtoignore" network=current
in your script before the command to create the annotations.
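A small shell sketch of that ordering: it prints one ignore command per word, followed by the annotate command discussed earlier in this thread. The helper name make_annotate_script and the output file name are made up for illustration, and whether the ignore list is case-sensitive is not stated in this thread, so the words are passed through unchanged.

```shell
#!/bin/sh
# Sketch: print "wordcloud ignore add" commands first, then the annotate command.
# make_annotate_script is a hypothetical helper name; the command text itself
# is taken from this thread.
# Arguments: the words to exclude from cluster labels.
make_annotate_script() {
    for word in "$@"; do
        printf 'wordcloud ignore add value="%s" network=current\n' "$word"
    done
    printf '%s\n' "autoannotate annotate-clusterBoosted clusterAlgorithm=MCL labelColumn=EnrichmentMap::GS_DESCR maxWords=3 network=current edgeWeightColumn=EnrichmentMap::similarity_coefficient"
}

# Example with the GO prefixes mentioned above:
# make_annotate_script GOBP GOMF GOCC > annotate_commands.txt
```

The resulting lines can be run in order from whichever client you use to send commands to Cytoscape.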
Dear Ruth,
I am using C5.all.v2024.1.Hs.symbols.gmt, which is distributed with GSEA. That results in EnrichmentMap GS_DESCR values like https://www.gsea-msigdb.org/gsea/msigdb/human/geneset/GOBP_ELECTRON_TRANSPORT_CHAIN and EnrichmentMap node names like GOBP_ELECTRON_TRANSPORT_CHAIN. AutoAnnotate then builds labels that include words such as GOBP.
Naturally, these can be removed with a Python/R script, but working manually is cumbersome.
Best, Yaron
Hi Yaron, OK, that makes sense. I forgot that is the way GSEA structures their gmt file. EM and AA are optimized for our gmt files, which structure the name and description a little differently. I would recommend switching to them if you can. They are updated on a monthly basis, so they are more up to date than the ones released by GSEA - https://download.baderlab.org/EM_Genesets/current_release/ - (info here - https://baderlab.org/GeneSets). The only caveat is that they are only available for Human, Mouse, Rat and Woodchuck. Thanks, Ruth
Hello, I was trying to build an Autoannotate clustering from the command line using a command:
autoannotate annotate-clusterBoosted clusterAlgorithm=MCL labelColumn=EnrichmentMap::GS_DESCR maxWords=3 network=current edgeWeightColumn=name
However, I get an error message: Cannot invoke "org.baderlab.autoannotate.internal.model.AnnotationSetBuilder.getClusters()" because "this.builder" is null
Clustering using the Cytoscape AutoAnnotate menu works just fine; only the command line produces the error message. In addition, if I increase the similarity cutoff of the network so that fewer edges are formed, clustering works perfectly well from both the command line and the Cytoscape AutoAnnotate menu.
What can be the source of the problem?
Best, Yaron Caspi