OSS-Lab / MetQy

Repository for R package MetQy (read related publication here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6247936/)

Information request regarding file upload #2

Open stefcamp opened 5 years ago

stefcamp commented 5 years ago

Dear Andreas, I am trying to use MetQy following the manual provided, but I am a little lost because I am not very experienced with R. The software is properly installed on my computer; I followed the examples and they produce results (except that some figures, e.g. the sunburst plots, are missing their text). Basically, I have some genomes annotated using KEGG and I would simply like to run “query_genomes_to_modules” with “user-specified gene sets”. My input file has a header and is organized as you suggest in the example:

ID ORG_ID ORGANISM KOs ECs
T09999 aaa A K00013;K00014;K00018;… “empty field”

Tabs separate the fields (ID, ORG_ID, ORGANISM, KOs, ECs) in both the header and the first data line. Is this correct? I do not have EC numbers (hence the empty field), only KEGG IDs for genes. Could you please provide some minimal commands to do the following: 1. Import the file into R so that it can be used by your software; 2. Calculate the module completion fraction (mcf) for all the modules; 3. Export the mcf values obtained for all the pathways to a text file. Thanks a lot in advance for your help. Sincerely

asmvernon commented 5 years ago

Hi Stephano, Thank you for your interest in MetQy and your patience while waiting for my reply.

Below I've included some code that will hopefully be useful. Note that when reading the data, you must specify stringsAsFactors = FALSE. Otherwise, the character variables are imported as factors (which have a class attribute) and are not compatible with the MetQy functions.
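To illustrate the point about factors, here is a minimal base R sketch. The one-row table is made up to mimic the layout from the question, and it is not taken from the MetQy documentation. In R 4.0.0 and later the default is already stringsAsFactors = FALSE, but setting it explicitly keeps the behaviour the same across R versions.

```r
# Toy tab-separated table (values are invented for illustration only)
txt <- "ID\tORG_ID\tORGANISM\tKOs\tECs\nT09999\taaa\tA\tK00013;K00014;K00018\t\n"

# Same text read twice, once with factors and once without
as_factor <- read.delim(text = txt, stringsAsFactors = TRUE)
as_char   <- read.delim(text = txt, stringsAsFactors = FALSE)

class(as_factor$KOs)  # "factor"
class(as_char$KOs)    # "character"
```

Only the second form gives plain character strings that can be split on ";" into individual KO identifiers.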

Do let me know if you have any more questions!

All the best, Andrea

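## LOAD THE PACKAGE
library(MetQy)

## USE THE EXAMPLE DATA
data(data_example_multi_ECs_KOs)
# row.names = FALSE avoids writing an extra, unnamed row-name column
write.csv(data_example_multi_ECs_KOs, file = "data_example_multi_ECs_KOs.csv", row.names = FALSE)

## IMPORT DATA
# stringsAsFactors = FALSE keeps the ID and KOs columns as character vectors
myData <- read.csv("data_example_multi_ECs_KOs.csv", header = TRUE, stringsAsFactors = FALSE)

## CALCULATE THE MCF
# MODULE_ID here limits the query to modules M00001 to M00005
OUT_myData <- query_genomes_to_modules(myData, GENOME_ID_COL = "ID",
                                       GENES_COL = "KOs",
                                       MODULE_ID = paste("M0000", 1:5, sep = ""),
                                       META_OUT = TRUE, ADD_OUT = TRUE)

## WRITE THE DATA AS A .csv
write.csv(OUT_myData$MATRIX, file = "mcf_matrix.csv")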
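The example above reads a comma-separated file, whereas the file described in the question is tab-separated with an empty ECs field. A minimal base R sketch for that case follows. The file name my_genomes.tsv and the two toy rows are hypothetical placeholders, and the sketch only covers the import step; the resulting data frame would then be passed to query_genomes_to_modules with the same GENOME_ID_COL and GENES_COL arguments as above.

```r
# Hypothetical file name and toy rows; replace with the real annotation table
input_file <- file.path(tempdir(), "my_genomes.tsv")
toy <- data.frame(
  ID       = c("T09998", "T09999"),
  ORG_ID   = c("aab", "aaa"),
  ORGANISM = c("B", "A"),
  KOs      = c("K00001;K00002", "K00013;K00014;K00018"),
  ECs      = c("", ""),
  stringsAsFactors = FALSE
)
write.table(toy, input_file, sep = "\t", quote = FALSE, row.names = FALSE)

# read.delim expects tab-separated fields; colClasses = "character" keeps
# every column as text, so the empty ECs field is read as "" rather than NA
myData <- read.delim(input_file, header = TRUE,
                     stringsAsFactors = FALSE, colClasses = "character")

str(myData)  # inspect column names and types before running the query
```

Whether MetQy treats an empty ECs column differently from a missing one is not stated in this thread, so it is worth checking the output of str(myData) against the package's example data before running the query.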