z0on / emapper_to_GOMWU_KOGMWU

A few awk one-liners to extract data tables compatible with GO_MWU and KOGMWU methods out of eggNOG-eMapper output

KOGMWU error about differing rows #3

Closed yaaminiv closed 1 year ago

yaaminiv commented 1 year ago

Thanks for creating the KOGMWU package. Not sure if this is the best venue to post a question, but I'm running into the following error about differing rows when doing a KOGMWU analysis:

kog.mwu(temp, ApalmKOG, Alternative = "t")

Warning: NAs introduced by coercion
Warning: cannot xtfrm data frames
[1] "Continuous measure of interest: will perform MWU test"
[1] "skipping integer(0) nseqs = 1"
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 1, 0

6. stop(gettextf("arguments imply differing number of rows: %s", paste(unique(nrows), collapse = ", ")), domain = NA)
5. data.frame(..., check.names = FALSE)
4. cbind(deparse.level, ...)
3. cbind(term = as.character(terms), res)
2. kog.mwut(annotated, Alternative)
1. kog.mwu(temp, ApalmKOG, Alternative = "t")

My input files:

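data.txt (https://github.com/z0on/emapper_to_GOMWU_KOGMWU/files/10902367/data.txt), gene2kog.txt (https://github.com/z0on/emapper_to_GOMWU_KOGMWU/files/10902368/gene2kog.txt)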

Is there a reason why I'm getting this error? From my understanding, the two input data frames can have differing rows. Thanks for your help!

z0on commented 1 year ago

Hi - thanks for giving my package a shot! The data files look fine, but are you sure they are imported correctly into R? I mean, I would check that my input datasets (temp and ApalmKOG) look like nice healthy data frames in R.
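A minimal sketch of the kind of check suggested above, assuming only base R. The helper `describe_input()` and the toy table are hypothetical and are not part of KOGMWU. Note that `is.data.frame()` also returns TRUE for subclasses of data frames, so inspecting `class()` directly is more informative.

```r
# Hypothetical helper (not part of KOGMWU): summarise how R sees an input table.
describe_input <- function(x) {
  list(
    classes = class(x),
    # TRUE only for a plain base-R data frame, not for subclasses of it
    is_plain_df = identical(class(x), "data.frame"),
    n_rows = nrow(x),
    n_cols = ncol(x)
  )
}

# Toy stand-in with made-up values; replace with the real objects
# (temp and ApalmKOG in this thread).
toy <- data.frame(gene = c("g1", "g2", "g3"), log2fc = c(0.5, -1.2, 0.1))

info <- describe_input(toy)
print(info$classes)
str(toy)
```

If `classes` lists anything in front of "data.frame", the object is a subclass and may behave differently from a plain data frame when indexed.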

Cheers,
Misha
matzlab.weebly.com

yaaminiv commented 1 year ago

Hi Misha,

The data frames look fine in R. I modeled them after the larval and adult data included with the KOGMWU package with gene and KOG data as factors.

> str(temp)
tibble [400 × 2] (S3: tbl_df/tbl/data.frame)
 $ ApalmGeneID: Factor w/ 400 levels "evm.model.hic_scaffold_1.11",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ log2fc     : num [1:400] 0.484 0.999 -0.977 -0.943 0.832 ...

> head(temp)
# A tibble: 6 × 2
  ApalmGeneID                   log2fc
  <fct>                          <dbl>
1 evm.model.hic_scaffold_1.11    0.484
2 evm.model.hic_scaffold_1.1206  0.999
3 evm.model.hic_scaffold_1.1237 -0.977
4 evm.model.hic_scaffold_1.1322 -0.943
5 evm.model.hic_scaffold_1.1436  0.832
6 evm.model.hic_scaffold_1.1476  0.440
> str(ApalmKOG)
tibble [11,697 × 2] (S3: tbl_df/tbl/data.frame)
 $ ApalmGeneID: Factor w/ 10362 levels "evm.model.hic_scaffold_1.1",..: 1 1 2 3 4 5 6 7 8 9 ...
 $ KOG        : Factor w/ 23 levels "Amino acid transport and metabolism",..: 22 17 18 18 8 21 18 21 21 21 ...

> head(ApalmKOG)
# A tibble: 6 × 2
  ApalmGeneID                             KOG                                                         
  <fct>                                   <fct>                                                       
1 evm.model.hic_scaffold_1.1              Transcription                                               
2 evm.model.hic_scaffold_1.1              Posttranslational modification, protein turnover, chaperones
3 evm.model.hic_scaffold_1.101            Replication, recombination and repair                       
4 evm.model.hic_scaffold_1.101.1.5f5b2bfd Replication, recombination and repair                       
5 evm.model.hic_scaffold_1.1013           Cytoskeleton                                                
6 evm.model.hic_scaffold_1.102            Signal transduction mechanisms  

I don't see any obvious problems, so I'd appreciate any insight you have!

yaaminiv commented 1 year ago

Hi @z0on, I'm still getting errors when I try to run KOGMWU:

> kog.mwu(data = temp, gene2kog = ApalmKOG, Alternative = "t")

Error in xtfrm.data.frame(x) : cannot xtfrm data frames
11. stop("cannot xtfrm data frames")
10. xtfrm.data.frame(x)
9. | xtfrm(x)
8. as.vector(xtfrm(x))
7. FUN(X[[i]], ...)
6. lapply(z, function(x) if (is.object(x)) as.vector(xtfrm(x)) else x)
5. order(y)
4. factor(x)
3. as.factor(annotated[, 2])
2. levels(as.factor(annotated[, 2]))
1. kog.mwu(data = temp, gene2kog = ApalmKOG, Alternative = "t")

Any suggestions for how to get around this?

z0on commented 1 year ago

Apologies for the radio silence; this somehow fell behind my event horizon. This looks strange. These objects are not supposed to be tibbles; they must be simple data frames. Can you send your whole code with all input files?
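A likely reason tibbles trip up the `as.factor(annotated[, 2])` step in the traceback: single-column `[ , j]` indexing drops to a vector on a base data frame but stays a one-column table on a tibble. The sketch below illustrates only that difference. It uses made-up toy data, assumes the tibble package may or may not be installed, and does not reproduce KOGMWU's internals.

```r
# Toy annotation table (made-up values, not the thread's data).
df <- data.frame(gene = c("g1", "g2", "g3"), kog = c("A", "B", "A"))

# Base data frame: single-column [ , j] indexing drops to a plain vector.
col_df <- df[, 2]
print(is.data.frame(col_df))

# Tibble: the same indexing keeps a one-column tibble, which is what
# as.factor() would then receive. Guarded so the sketch runs without tibble.
if (requireNamespace("tibble", quietly = TRUE)) {
  tb <- tibble::as_tibble(df)
  col_tb <- tb[, 2]
  print(class(col_tb))
  # [[ ]] extracts the underlying vector from either kind of table.
  print(identical(tb[[2]], df[[2]]))
}
```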

yaaminiv commented 1 year ago

Converting my tibbles to data frames with as.data.frame() allowed me to run my data through KOGMWU! Thanks for pointing that out.
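For later readers, a minimal sketch of that conversion. The toy table uses made-up values and is only a stand-in that carries a tibble's class vector so the example runs without the tibble package; real inputs would be the `temp` and `ApalmKOG` objects from this thread.

```r
# Stand-in for a tibble: a data frame carrying tibble's class vector.
# Made-up values; the column names follow the str() output shown earlier.
toy <- data.frame(
  ApalmGeneID = factor(c("g1", "g2", "g3")),
  log2fc = c(0.48, 1.00, -0.98)
)
class(toy) <- c("tbl_df", "tbl", "data.frame")

# as.data.frame() strips the extra classes and keeps the data unchanged.
temp_df <- as.data.frame(toy)
print(class(temp_df))

# Column extraction now yields a plain factor vector again.
print(is.factor(temp_df[, 1]))

# With KOGMWU installed and the real inputs loaded, the call from the
# thread would then look like this (left commented out here):
# kog.mwu(as.data.frame(temp), as.data.frame(ApalmKOG), Alternative = "t")
```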