z0on / GO_MWU

Rank-based Gene Ontology analysis of gene expression data

GO_MWU.R error #7

Closed Ruiqi-CUB closed 3 years ago

Ruiqi-CUB commented 3 years ago

Hello Dr. Matz,

I was trying to run the GO_MWU.R but ended up with an error at the very first step. Would you mind having a look?

The code I tried to run was

gomwuStats(input, goDatabase, goAnnotations, goDivision,
    perlPath="perl", # replace with full path to perl executable if it is not in your system's PATH already
    largest=0.1,  # a GO category will not be considered if it contains more than this fraction of the total number of genes
    smallest=5,   # a GO category should contain at least this many genes to be considered
    clusterCutHeight=0.25, # threshold for merging similar (gene-sharing) terms. See README for details.
#   Alternative="g" # by default the MWU test is two-tailed; specify "g" or "l" if you want to test for "greater" or "less" instead. 
#   Module=TRUE,Alternative="g" # un-remark this if you are analyzing a SIGNED WGCNA module (values: 0 for not in module genes, kME for in-module genes). In the call to gomwuPlot below, specify absValue=0.001 (count number of "good genes" that fall into the module)
#   Module=TRUE # un-remark this if you are analyzing an UNSIGNED WGCNA module 
)

The error I got was

go.obo scruposum_gene2go.tab scruposum_foldchange.csv CC largest=0.1 smallest=5 cutHeight=0.25

Run parameters:

largest GO category as fraction of all genes (largest)  : 0.1
         smallest GO category as # of genes (smallest)  : 5
                clustering threshold (clusterCutHeight) : 0.25

-----------------
retrieving GO hierarchy, reformatting data...

-------------
go_reformat:
Genes with GO annotations, but not listed in measure table: 41394

Terms without defined level (old ontology?..): 0
-------------
-------------
go_nrify:
0 categories, 0 genes; size range 5-0
    0 too broad
    0 too small
    0 remaining

removing redundancy:

calculating GO term similarities based on shared genes...

 Error in read.table(inname, sep = "\t", header = T, check.names = F) : 
  no lines available in input 

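I checked the format of my input files, but they look fine to me. [image] https://user-images.githubusercontent.com/46695842/101587791-2b5e0800-39a2-11eb-97a8-055eb2c51bc7.png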

Would you mind having a look? Thank you so much!

z0on commented 3 years ago

Hi - thanks for trying out GO_MWU! Your gene names seem to be different in the fold-change table compared to the annotations table. Remove '_i1' from the gene names in the annotations file?
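
A minimal sketch of this fix in R. The gene IDs and column names below are hypothetical (the real IDs appear only in the screenshot), and the two small data frames stand in for the annotations and fold-change files. The idea is to compare the two sets of gene IDs before and after stripping a trailing `_i1`:

```r
# Hypothetical stand-ins for the annotations table (gene id, GO terms)
# and the fold-change table (gene id, measure).
annotations <- data.frame(
  gene = c("TRINITY_DN1_c0_g1_i1", "TRINITY_DN2_c0_g1_i1", "TRINITY_DN3_c0_g2_i1"),
  go   = c("GO:0005575;GO:0005634", "GO:0005737", "GO:0016020"),
  stringsAsFactors = FALSE
)
measures <- data.frame(
  gene = c("TRINITY_DN1_c0_g1", "TRINITY_DN2_c0_g1", "TRINITY_DN4_c0_g1"),
  lfc  = c(1.2, -0.4, 0.1),
  stringsAsFactors = FALSE
)

# How many gene IDs do the two tables share as-is?
shared_before <- length(intersect(annotations$gene, measures$gene))

# Strip the "_i1" suffix only where it ends the ID, then re-check.
annotations$gene <- sub("_i1$", "", annotations$gene)
shared_after <- length(intersect(annotations$gene, measures$gene))

cat("shared before:", shared_before, " shared after:", shared_after, "\n")
```

An overlap of zero before the fix would be consistent with the "0 categories, 0 genes" lines in the log above. If several isoforms collapse onto the same gene ID after stripping, the annotations table may also need de-duplicating.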

-- cheers Misha matzlab.weebly.com

Ruiqi-CUB commented 3 years ago

Oh right! Thanks! I just remembered that I had to add _i1 in one of the intermediate steps for annotation.

I will try it!

-- Ruiqi Li PhD Student Dept. of Ecology and Evolutionary Biology University of Colorado Boulder pronouns: he/him/his

Ruiqi-CUB commented 3 years ago

Hi Dr. Matz, it works! Thanks a lot! The BP one has been running for 2 hours, but the other two were done in a few minutes. Is there a RAM or CPU requirement for running BP? Also, is there a particular reason that you use p-value instead of fold-change and adjusted p-value in the test?

Best Ruiqi

z0on commented 3 years ago

Hi Ruiqi - yeah, BP can take a long time for large, richly annotated datasets. I suspect it is a memory problem, but I am not quite sure. Try reducing the number of genes in your data (toss more of the low-abundance ones, until you have ~10-12K genes remaining) - this should not affect the GO summaries if the signal is robust enough.
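
One way to do this trimming, sketched in R on toy data. The column name `baseMean` (a per-gene mean abundance) and the target `keep_n` are assumptions for illustration; on a real dataset `keep_n` would be somewhere around 10000-12000:

```r
# Toy table: 20 genes with a hypothetical mean-abundance column.
expr <- data.frame(
  gene     = paste0("gene", 1:20),
  baseMean = c(5, 300, 12, 1, 87, 45, 2, 1500, 9, 60,
               3, 220, 18, 700, 4, 33, 95, 7, 410, 26),
  stringsAsFactors = FALSE
)

keep_n <- 12  # hypothetical target number of genes to retain

# Rank genes from most to least abundant and keep the top keep_n.
ranked <- expr[order(expr$baseMean, decreasing = TRUE), ]
kept   <- ranked[seq_len(min(keep_n, nrow(ranked))), ]
dropped <- ranked[-seq_len(min(keep_n, nrow(ranked))), ]

cat("kept", nrow(kept), "genes; lowest retained abundance:", min(kept$baseMean), "\n")
```

The annotations table does not need trimming to match in this sketch; only the measure table determines which genes are analyzed, though that is worth confirming against the README.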

I prefer "signed -log pvalues" because they tend to give stronger signals, but broadly the same result should be obtainable with just log-fold changes. Try it?
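
For illustration, a "signed -log p-value" can be computed as the negative log of the p-value carrying the sign of the fold change. The column names `lfc` and `pvalue` and the use of base-10 logs are assumptions here, not something stated in this thread:

```r
# Hypothetical differential-expression results for three genes.
res <- data.frame(
  gene   = c("g1", "g2", "g3"),
  lfc    = c(2.0, -1.5, 0.2),
  pvalue = c(1e-4, 0.01, 0.5),
  stringsAsFactors = FALSE
)

# Magnitude comes from the p-value, direction from the fold change.
res$signed_logp <- sign(res$lfc) * -log10(res$pvalue)

print(res)
```

Strongly up-regulated genes get large positive values and strongly down-regulated genes get large negative values, so the ranking reflects both direction and statistical support.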

cheers Misha

Ruiqi-CUB commented 3 years ago

Hi Misha,

I really appreciate your help! I ended up running gomwuStats on a server and then downloaded the intermediate files to my laptop. It worked well.

Ruiqi

Ruiqi-CUB commented 3 years ago

Hi Misha, are there any suggestions if I have too many GO terms (~70) on the final figure? I have already used the "strict" cutoffs. Thanks a lot

Ruiqi

Ruiqi-CUB commented 3 years ago

Hi Misha,

I got a lot of GO terms (>50) in the figure even when I use some very strict cutoff values for -logP (level1=0.01, level2=0.001, level3=0.0001). Would you mind giving me some suggestions? I noticed that you tend to have very few GO terms in the GO MWU figures in your papers. Did you do any filtering?
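
A quick way to see how many terms would survive each cutoff before plotting, sketched in R. The table below is hypothetical, and both the column name `p.adj` and the reading of the levels as adjusted p-value thresholds are assumptions:

```r
# Hypothetical adjusted p-values for a handful of GO terms.
terms <- data.frame(
  term  = paste0("GO:000000", 1:6),
  p.adj = c(0.2, 0.04, 0.008, 0.0007, 0.00005, 0.000002),
  stringsAsFactors = FALSE
)

# The three cutoffs quoted above.
levels_tried <- c(level1 = 0.01, level2 = 0.001, level3 = 0.0001)

# Count the terms falling below each cutoff.
n_passing <- sapply(levels_tried, function(cutoff) sum(terms$p.adj < cutoff))
print(n_passing)
```

If most terms still pass the strictest level, tightening the cutoffs further is unlikely to thin out the figure by much.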

Thank you so much for your help! Here is an example of my GO MWU figure. [image: image.png]

Best Ruiqi

z0on commented 3 years ago

Hi Ruiqi - this is possible, I have seen that, but it can also be a mistake in setting up the options or the measures table. Can you please post the resulting figure? (crank down the levels another 10-fold if you must) Misha

Ruiqi-CUB commented 3 years ago

Sorry, it seems the figures did not go through via Gmail.

Here is the figure with level1=0.01, level2=0.001, level3=0.0001: [image: image] https://user-images.githubusercontent.com/46695842/102569272-621ed700-40a2-11eb-8da3-0d77b61b4297.png

Here is the figure with the thresholds another 10-fold lower; it still looks messy: [image: image] https://user-images.githubusercontent.com/46695842/102569321-7f53a580-40a2-11eb-96ee-d251c9951830.png

One thing I noticed in the second figure is that some GO terms were plotted even though the number is 0:

[image: image] https://user-images.githubusercontent.com/46695842/102569400-a8743600-40a2-11eb-8f21-ffe1691a1a30.png

The figure gets even messier with BP, even using level1=0.001, level2=0.0001, level3=0.00001: [image: image] https://user-images.githubusercontent.com/46695842/102569747-4405a680-40a3-11eb-9c60-705d82eb96fd.png

Here is the R code in GO_MWU.R: [image: image] https://user-images.githubusercontent.com/46695842/102569823-6ac3dd00-40a3-11eb-995a-2de962d776eb.png

z0on commented 3 years ago

This is a bit strange, but not entirely impossible... What is your measure for ranking? Do you include all genes for which there are gene expression measurements?

Ruiqi-CUB commented 3 years ago

The ranking measure is -log10(unadjusted P). Yes, I used all the genes for which there are gene expression measurements in the DESeq2 output. But I think it should not matter, since we specify p-value cutoffs at the gomwuPlot step? Should I filter on anything, such as gene expression counts, before performing GO_MWU?

Ruiqi

-- Ruiqi Li PhD Student Dept. of Ecology and Evolutionary Biology University of Colorado Boulder pronouns: he/him/his
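
For readers following along, the ranking measure described in this comment could be built roughly as in the sketch below. It is only an illustration: the data frame `res`, its gene IDs and p-values, and the output filename are made up; the only thing taken from the thread is that the measure is -log10 of the unadjusted p-value, computed for every gene with no filtering.

```r
# Hypothetical stand-in for a differential-expression results table
# (one row per tested gene, no filtering applied).
res <- data.frame(
  gene   = c("gene1", "gene2", "gene3", "gene4"),
  pvalue = c(1e-10, 0.04, 0.5, 1),
  stringsAsFactors = FALSE
)

# Ranking measure: -log10 of the unadjusted p-value, one value per gene.
measure <- data.frame(
  gene  = res$gene,
  value = -log10(res$pvalue),
  stringsAsFactors = FALSE
)

# Write a two-column CSV with a header; the filename is a placeholder.
out_file <- file.path(tempdir(), "example_measure.csv")
write.csv(measure, out_file, row.names = FALSE, quote = FALSE)
```

Every gene in `res` ends up in `measure`, including the ones with p-values near 1.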

z0on commented 3 years ago

Sounds correct... are you studying some model organism (with really good annotations) perhaps?

-- cheers Misha matzlab.weebly.com

Ruiqi-CUB commented 3 years ago

No.. they are cockles from subfamily Fraginae, which does not have good annotations.

Ruiqi-CUB commented 3 years ago

Since I am using -log10P as the ranking measure, there is no cutoff on logFC. Would that affect the results?

Ruiqi

z0on commented 3 years ago

That is correct: there must be no cutoffs. The input file must list all genes, with -log10(p) for each. Could you send me your annotations file and the input file? Also, what is the experiment (if I may ask)?
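
Before sending files around, a quick way to see how well the two tables line up is to compare their gene IDs. The sketch below is an assumption-laden illustration, not part of GO_MWU: both tables are invented, and the annotation layout (gene ID plus semicolon-separated GO terms) is assumed rather than taken from this thread.

```r
# Hypothetical measure table: every gene with an expression measurement.
measure <- data.frame(
  gene  = c("g1", "g2", "g3", "g4"),
  value = c(10, 1.4, 0.3, 0),
  stringsAsFactors = FALSE
)

# Hypothetical annotation table (assumed layout: gene ID, GO terms joined by ";").
annot <- data.frame(
  gene = c("g1", "g2", "g5"),
  go   = c("GO:0005575;GO:0005634", "GO:0005737", "GO:0005634"),
  stringsAsFactors = FALSE
)

# Gene IDs have to match exactly between the two tables to be paired up.
shared         <- intersect(measure$gene, annot$gene)
annotated_only <- setdiff(annot$gene, measure$gene)  # annotated, no measure
unannotated    <- setdiff(measure$gene, annot$gene)  # measured, no annotation

cat("genes in both tables:", length(shared), "\n")
cat("annotated but not in measure table:", length(annotated_only), "\n")
cat("measured but not annotated:", length(unannotated), "\n")
```

A large "annotated but not in measure table" count, or a very small overlap, would usually point to mismatched gene identifiers between the files.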

Ruiqi-CUB commented 3 years ago

Thanks! May I send them to your personal email?

Ruiqi-CUB commented 3 years ago

Hi Misha, I just sent all the files to your email. Please let me know if you can access them! Thanks a lot!

z0on commented 3 years ago

sure!

Ruiqi-CUB commented 3 years ago

Thanks a lot! I have sent them to you via Google Drive and Gmail. One thought occurred to me: since I am using -log10P, the logFC won't affect the results of the rank test. For example, gene 1 (logFC=0.4, -log10P=10) and gene 2 (logFC=5, -log10P=10) would carry the same weight in the test. If I believe that a gene with log2FC<1 is not really differentially expressed, should I apply a cutoff at logFC<1 to the DE results and then perform GO_MWU?

Best Ruiqi
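
The observation about gene 1 and gene 2 can be checked with a few lines of base R. This is a toy illustration with invented values (gene 3 and gene 4 are added only to give the ranking something to order); it shows how ranks are assigned from -log10P alone and says nothing about whether a logFC cutoff is advisable.

```r
# Hypothetical genes: gene1 and gene2 share the same -log10(P) but differ in logFC.
genes <- data.frame(
  gene     = c("gene1", "gene2", "gene3", "gene4"),
  logFC    = c(0.4, 5, 2, -1),
  neglog10 = c(10, 10, 3, 0.5),
  stringsAsFactors = FALSE
)

# A rank-based test only sees the ordering of the measure it is given;
# tied values receive the average of the ranks they span.
genes$rank <- rank(genes$neglog10)

print(genes)
```

Here gene1 and gene2 receive identical ranks even though their logFC values differ more than tenfold, because logFC never enters the ranking.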

On Fri, Dec 18, 2020 at 10:37 AM Mikhail V Matz notifications@github.com wrote:

sure!

On Fri, Dec 18, 2020 at 8:44 AM Ruiqi-CUB notifications@github.com wrote:

Thanks! May I send them to your personal email?

On Fri, Dec 18, 2020 at 6:02 AM Mikhail V Matz <notifications@github.com

wrote:

That is correct - There must be no cutoffs. The input file must list all genes, -log10p for each. Can you maybe send me your annotations file and the input file? Also, what is the experiment? (If I may ask)

On Fri, Dec 18, 2020 at 1:15 AM Ruiqi-CUB notifications@github.com wrote:

Since I am using -log10P for the measurement of ranking, there is no cut-off for logFC. Would that impact the results?

Ruiqi On Thu, Dec 17, 2020 at 10:14 PM Mikhail V Matz < notifications@github.com> wrote:

Sounds correct... are you studying some model organism (with really good annotations) perhaps?

On Thu, Dec 17, 2020 at 10:39 PM Ruiqi-CUB < notifications@github.com

wrote:

The measure of ranking is -log10(unadjusted-P). Yes I did use all the genes for which there are gene expression measurements from DESeq2 output. But I think it should not matter since we are specifying cutoffs for p-value at gomwuPlot step? Should I do any filtration based on parameters such as gene expression count before performing GO_MWU?

Ruiqi

On Thu, Dec 17, 2020 at 9:00 PM Mikhail V Matz < notifications@github.com

wrote:

This is a bit strange, but not entirely impossible... What is your measure for ranking? Do you include all genes for which there are gene expression measurements?

On Thu, Dec 17, 2020 at 9:07 PM Ruiqi-CUB < notifications@github.com> wrote:

Sorry it seems like the figure did not go through via gmail.

Here is the figure with level1=0.01, level2=0.001, level3=0.0001 [image: image] <

https://user-images.githubusercontent.com/46695842/102569272-621ed700-40a2-11eb-8da3-0d77b61b4297.png

Here is the figure with another 10-fold down but it still looks messy. [image: image] <

https://user-images.githubusercontent.com/46695842/102569321-7f53a580-40a2-11eb-96ee-d251c9951830.png

One thing on the second figure I noticed is that some GO terms were plotted even the number is 0.

[image: image] <

https://user-images.githubusercontent.com/46695842/102569400-a8743600-40a2-11eb-8f21-ffe1691a1a30.png

The figure get even messier with BP, even using level1=0.001, level2=0.0001, level3=0.00001 [image: image] <

https://user-images.githubusercontent.com/46695842/102569747-4405a680-40a3-11eb-9c60-705d82eb96fd.png

Here is the R code in GO_MWU.R [image: image] <

https://user-images.githubusercontent.com/46695842/102569823-6ac3dd00-40a3-11eb-995a-2de962d776eb.png

— You are receiving this because you commented. Reply to this email directly, view it on GitHub < https://github.com/z0on/GO_MWU/issues/7#issuecomment-747839127 , or unsubscribe <

https://github.com/notifications/unsubscribe-auth/ABZUHGANSQ4LSJPLETHUYDDSVLBOJANCNFSM4US6PVXA

.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub < https://github.com/z0on/GO_MWU/issues/7#issuecomment-747853112 , or unsubscribe <

https://github.com/notifications/unsubscribe-auth/ALEILIWGZAMK7TBGCE4KVRTSVLHWXANCNFSM4US6PVXA

.


z0on commented 3 years ago

Hi Ruiqi - damn, it looks real to me. Amazing dataset! You somehow have very extensive annotations, that gives you extra power. Where did you get the annotations from? (there is a bunch of "obsolete" terms in it, maybe re-annotate?)

Also there is the last part of GO_MWU.R that gives you "best GOs" representing independent groups of GO terms - use that to summarize your super-extensive GO list? (I just pushed the commit correcting a minor bug there :)

Misha

On Fri, Dec 18, 2020 at 10:19 AM Ruiqi-CUB notifications@github.com wrote:

Hi Misha, I just sent all the files to your email. Please let me know if you can access them! Thanks a lot!

Ruiqi-CUB commented 3 years ago

I annotated it with eggNOG-mapper v2.0.1b, using the database downloaded by the script that ships with it, with the following command:

python2.7 emapper.py -m diamond --output_dir ../ --translate -o unigene_fragum --cpu 8 -i ../unigene_fragum.fna

Is there a better way to annotate them? Should I use InterProScan instead? Do you happen to know why the GO terms are "obsolete"? The databases should be up to date.

As for the best GOs, is it reasonable to plot just the best GOs? How can I plot them with the plot command within GO_MWU?

Thank you so much for your help! Ruiqi


Ruiqi-CUB commented 3 years ago

Just want to confirm: is dissim_(GO division)_(go-to-gene table filename) the same given the same GO division and go-to-gene table filename, even if the input filenames are different?

I am using a loop in R to run GO_MWU for several datasets sharing the same go-to-gene table filename. I just found out that dissim_(GO division)_(go-to-gene table filename) is overwritten every time GO_MWU is run in the same GO division, e.g. input1.txt with BP and input2.txt with BP.

z0on commented 3 years ago

Yes, this is correct. My code is not very efficient, but it works :)


Ruiqi-CUB commented 3 years ago

Thanks a lot!


Ruiqi-CUB commented 3 years ago

Hi Misha,

I suspect that dissim_(GO division)_(go-to-gene table filename) is not the same for different input files (gene-log10P), even if the GO division and go-to-gene table are the same.

I tried to run gomwuStats with input1.csv and input2.csv with CC and the same goAnnotations gene2go.tab. dissim_CC_gene2go.tab is overwritten once. After getting the output file, I tried to run gomwuPlot. The GO_MWU figure for input2 can be plotted, but there is an error message for input1, as below.

 Error in `[.data.frame`(diss, goods.names, goods.names) : 
  undefined columns selected 

Then I ran gomwuStats and gomwuPlot for input1.csv and input2.csv separately. Both figures were plotted successfully. The dissim_MF_gene2go.tab for input1.csv is 15.5MB, while the dissim_MF_gene2go.tab for input2.csv is 15.6MB.

Could you please check your code to see if that is the issue? If it is, would you mind modifying the code to rename dissim_(GO division)_(go-to-gene table filename) to dissim_(GO division)_(input filename)_(go-to-gene table filename) instead?
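The `undefined columns selected` message is what base R raises when a data frame is indexed by column names it does not contain, which would be consistent with a dissimilarity table written for one input being read back for another. A minimal sketch with made-up GO identifiers (the toy `diss` table and `goods.names` vector below are illustrative assumptions, not GO_MWU's actual data):

```r
# Toy dissimilarity table, as it might be left on disk by the run for input2.
terms_input2 <- c("GO:0000001", "GO:0000002", "GO:0000003")
diss <- as.data.frame(matrix(0, nrow = 3, ncol = 3))
rownames(diss) <- terms_input2
colnames(diss) <- terms_input2

# Terms selected for input1; one of them is absent from the table above.
goods.names <- c("GO:0000001", "GO:0000099")

# Indexing by a missing column name fails; capture the message instead of stopping.
msg <- tryCatch({
  diss[goods.names, goods.names]
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Checking membership first makes the mismatch explicit.
missing_terms <- setdiff(goods.names, colnames(diss))
print(missing_terms)
```

If `missing_terms` is non-empty, the dissimilarity file on disk was most likely produced by a different run than the results being plotted.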

Thank you so much! Ruiqi

z0on commented 3 years ago

Hi Ruiqi - to be honest I would rather not touch my old perl code unless there is a critical error. If you think this is indeed an important thing to correct/add, you are welcome to create a git branch and fix that!

cheers Misha


Ruiqi-CUB commented 3 years ago

Thank you so much! Would you mind pointing out where the code that saves dissim_GO_gene2go.txt and GO_input.csv is?


z0on commented 3 years ago

:) if I only knew exactly what needs to be changed I would have changed it...


Ruiqi-CUB commented 3 years ago

I guess I have to change `"dissim_".$div."_".$gen2go` to `"dissim_".$div."_".$measure."_".$gen2go` at line 153 in gomwu_b.pl and line 53 in gomwu_a.pl, and `in.dissim=paste("dissim",goDivision,goAnnotations,sep="_")` to `in.dissim=paste("dissim",goDivision,input,goAnnotations,sep="_")` at line 173 in gomwu.functions.R. I have tested it with 2 input files and it works well; at least there is no error message.
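To see why adding the input name avoids the overwrite, here is a small sketch comparing the two naming schemes on the R side. The helper names `dissim_name_old` and `dissim_name_new` are made up for illustration; only the `paste(...)` calls mirror the edit described above.

```r
# Old scheme: the name depends only on the GO division and the annotation table.
dissim_name_old <- function(goDivision, goAnnotations) {
  paste("dissim", goDivision, goAnnotations, sep = "_")
}

# New scheme: the input (measure) file name is part of the name as well.
dissim_name_new <- function(goDivision, input, goAnnotations) {
  paste("dissim", goDivision, input, goAnnotations, sep = "_")
}

inputs <- c("input1.csv", "input2.csv")

old_names <- sapply(inputs, function(f) dissim_name_old("CC", "gene2go.tab"))
new_names <- sapply(inputs, function(f) dissim_name_new("CC", f, "gene2go.tab"))

print(unname(old_names))  # the same name for both inputs, so the second run overwrites the first
print(unname(new_names))  # one distinct file name per input
```

The Perl scripts and the R function have to build the name the same way, otherwise the R side will look for a file the Perl side never wrote.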

Ruiqi-CUB commented 3 years ago

Hi Dr. Matz, Sorry to bother you again. I am trying to interpret the best GO table. Does level mean the GO term level? Does nseqs mean the number of tested sequences (genes, isoforms, orthogroups, etc.) found associated with the GO term? Thank you so much!

   delta.rank         pval level nseqs                                        term                   name        p.adj
41       -871 6.495308e-05     5   205 GO:0000428;GO:0030880;GO:0016591;GO:0055029 RNA polymerase complex 5.249926e-04

z0on commented 3 years ago

yes and yes! Level is pretty non-informative since it is not standardized in any way across the GO hierarchy (some functional groups have many levels, some only a few). You might wish to explore different tree-cut cutoffs to get the most reasonable summary: plot GO trees and cutoff levels by un-remarking two lines in the script (saying this just in case, you probably did that already).
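For readers unfamiliar with the tree-cut idea, here is a minimal sketch on toy data. It assumes average-linkage clustering and made-up term names, dissimilarities and p-values; it illustrates the general approach (cut the tree, keep the most significant term per branch), not the exact code in GO_MWU.R.

```r
# Toy dissimilarity between five GO terms (0 = same gene set, 1 = no shared genes).
terms <- c("termA", "termB", "termC", "termD", "termE")
d <- matrix(1, 5, 5, dimnames = list(terms, terms))
diag(d) <- 0
d["termA", "termB"] <- d["termB", "termA"] <- 0.1  # A and B share most genes
d["termC", "termD"] <- d["termD", "termC"] <- 0.2  # C and D share most genes

# Toy p-values, one per term.
pval <- c(termA = 0.001, termB = 0.0005, termC = 0.02, termD = 0.03, termE = 0.004)

tree <- hclust(as.dist(d), method = "average")
hcut <- 0.5                       # tree-cut height; worth trying several values
groups <- cutree(tree, h = hcut)  # cluster id for each term

# plot(tree); abline(h = hcut, col = "red")  # un-remark to see the tree and the cut line

# Keep the term with the smallest p-value in each cluster.
best <- sapply(split(names(groups), groups), function(g) g[which.min(pval[g])])
print(unname(best))
```

Raising `hcut` merges more terms into each group and shortens the summary; lowering it keeps more representatives.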

(did the edit work?..)


Ruiqi-CUB commented 3 years ago

Thank you so much! The tree with the cut-off line works really well! It helps me get the representative GOs from the many GO terms in my analyses! Also, the negative delta.rank (the value before pval) means down-regulation, correct?

The edit works really well with a loop in R. Since BP usually takes about one hour on a server and I have so many contrasts (3 species and 3 treatments), it is much more convenient to run gomwuStats in a loop first, then explore each one with gomwuPlot later. I have posted the changes I made in a previous comment.

I guess I have to change `"dissim_".$div."_".$gen2go` to `"dissim_".$div."_".$measure."_".$gen2go` at line 153 in gomwu_b.pl and line 53 in gomwu_a.pl, and `in.dissim=paste("dissim",goDivision,goAnnotations,sep="_")` to `in.dissim=paste("dissim",goDivision,input,goAnnotations,sep="_")` at line 173 in gomwu.functions.R. I have tested it with 2 input files and it works well; at least there is no error message.
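A rough sketch of such a batch loop is below. The species and treatment labels, the file-name pattern, and the `run_batch` and `dry_run` helpers are all placeholders invented for illustration; in practice the stand-in function would wrap the real `gomwuStats(...)` call.

```r
species    <- c("sp1", "sp2", "sp3")      # placeholder species labels
treatments <- c("trtA", "trtB", "trtC")   # placeholder treatment labels

# One row per species-by-treatment contrast, with a hypothetical input file name.
contrasts <- expand.grid(species = species, treatment = treatments,
                         stringsAsFactors = FALSE)
contrasts$input <- paste0(contrasts$species, "_", contrasts$treatment, ".csv")

# Apply a statistics function to every input and keep the results by file name.
run_batch <- function(inputs, goDivision, stats_fn) {
  out <- lapply(inputs, function(f) stats_fn(f, goDivision))
  names(out) <- inputs
  out
}

# Stand-in for the real call; replace the body with gomwuStats(...) in practice.
dry_run <- function(input, goDivision) {
  paste("would run", goDivision, "on", input)
}

results <- run_batch(contrasts$input, "BP", dry_run)
print(length(results))
```

Keeping the results in a list named by input file makes it easy to come back later and plot one contrast at a time.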

z0on commented 3 years ago

yep, negative delta-rank is down-regulation. I added your edits to the code! thanks a lot, this is really helpful.
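As an illustration of the sign convention, here is a toy calculation. It assumes that delta rank is the mean rank of the genes in a category minus the mean rank of all other genes, and that the measure is signed so that negative values mean down-regulation; the gene names and values are made up.

```r
# Toy signed measure for ten genes (negative = down-regulated).
measure <- c(g1 = -3.2, g2 = -2.5, g3 = -1.8, g4 = -0.4, g5 = 0.1,
             g6 = 0.6, g7 = 1.1, g8 = 1.9, g9 = 2.4, g10 = 3.0)

# Genes annotated with the GO category of interest.
in_category <- names(measure) %in% c("g1", "g2", "g3", "g4")

# Rank all genes together, then compare mean ranks inside and outside the category.
r <- rank(measure)
delta_rank <- mean(r[in_category]) - mean(r[!in_category])
print(delta_rank)

# Two-tailed Mann-Whitney U test on the same split.
test <- wilcox.test(measure[in_category], measure[!in_category])
print(test$p.value)
```

Here the category genes sit at the bottom of the ranking, so the delta rank is negative; with a measure whose sign does not encode direction, the same number would only say "lower values", not "down-regulated".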


Ruiqi-CUB commented 3 years ago

Thank you! I am honored to contribute to your code!