spholmes / F1000_workflow

creating and plotting graphs from Workflow for Microbiome data #25

Closed Mdrexel2018 closed 5 years ago

Mdrexel2018 commented 5 years ago

Hello! I am new to coding and I am using the 'Bioconductor Workflow for Microbiome Data Analysis: from raw reads to community analyses'. I began with the DADA2 pipeline and am using that for other analyses. I am hoping that I can add some of these analyses as well. I am trying to go through the section on creating and plotting graphs. To make sure that I can do it correctly, I am working through it on my computer (Windows 10) with the sample data ps before I use my own data. I am currently stuck on this part. I am able to get through the first 5 lines successfully, but as soon as I get to the V(net)... portion I get an error message (seen below). I am not sure what it means or how to move forward.

library("phyloseqGraphTest")
library("igraph")
library("ggnetwork")
net <- make_network(ps, max.dist=0.35)
sampledata <- data.frame(sample_data(ps))
V(net)$id <- sampledata[names(V(net)), "host_subject_id"]
V(net)$litter <- sampledata[names(V(net)), "family_relationship"]

Error in `V<-`(`*tmp*`, value = 1:4) : invalid indexing

Thanks

spholmes commented 5 years ago

Hi, could you tell us a little more about your R and package versions? For instance, what does sessionInfo() say?

Also just to check, if you type ps what does it show?

It may be that a new version of ggnetwork is throwing things off.

It is usually a good idea to look in more detail at how some of the tools actually work, so you might benefit from looking at chapter 10 of the book here as well:

bios221.stanford.edu/book/

We'll try to respond according to what issues we see once we have the information. Thanks, Susan

-- Susan Holmes John Henry Samter Fellow in Undergraduate Education Professor, Statistics 2017-2018 CASBS Fellow, Sequoia Hall, 390 Serra Mall Stanford, CA 94305 http://www-stat.stanford.edu/~susan/

Mdrexel2018 commented 5 years ago

Hi,

These are the results that I get when I type in those two things.

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.5.1  magrittr_1.5    Matrix_1.2-14   tools_3.5.1     igraph_1.2.1
[6] grid_3.5.1      pkgconfig_2.0.1 lattice_0.20-35

> ps
Error: object 'ps' not found

I am not sure why ps is not found since I have downloaded that portion of the data. How would I get an older version of ggnetwork?

I will also check out chapter 10, thank you!

jfukuyama commented 5 years ago

I can get a similar error if I'm trying to assign bad values to V(net)$id or V(net)$litter, so I suspect the problem is that one of the upstream objects didn't get created correctly. Try running the commands and making sure that net and sampledata are both there, something like the following:

> net = make_network(ps, max.dist = .35)
> net
IGRAPH 72ee1a0 UN-- 181 405 -- 
+ attr: name (v/c)
+ edges from 72ee1a0 (vertex names):
 [1] F3D148--F3D149 F3D148--F3D15  F3D148--F3D17  F3D15 --F3D17  F3D144--F3D25 
 [6] F3D145--F3D364 F3D8  --F3D9   F3D147--F4D125 F3D17 --F4D13  F3D149--F4D141
[11] F4D141--F4D142 F4D125--F4D144 F4D143--F4D145 F4D125--F4D147 F3D147--F4D148
[16] F4D125--F4D148 F4D141--F4D148 F4D144--F4D148 F4D147--F4D148 F3D147--F4D149
[21] F4D144--F4D150 F4D148--F4D150 F3D15 --F4D25  F3D17 --F4D25  F3D3  --F4D3  
[26] F3D6  --F4D6   F4D5  --F4D6   F4D146--F4D65  F4D17 --F4D65  F4D6  --F4D7  
[31] F4D8  --F4D9   F3D146--F5D125 F3D25 --F5D142 F3D65 --F5D142 F5D141--F5D146
[36] F5D142--F5D146 F3D65 --F5D147 F5D142--F5D147 F3D144--F5D148 F3D25 --F5D148
+ ... omitted several edges
> head(names(V(net)))
[1] "F3D141" "F3D144" "F3D145" "F3D146" "F3D147" "F3D148"
> sampledata = data.frame(sample_data(ps))
> dim(sampledata)
[1] 360  14
> colnames(sampledata)
 [1] "collection_date"     "biome"               "target_gene"        
 [4] "target_subfragment"  "host_common_name"    "host_subject_id"    
 [7] "age"                 "sex"                 "body_product"       
[10] "tot_mass"            "diet"                "family_relationship"
[13] "genotype"            "SampleID"           
> V(net)$id = sampledata[names(V(net)), "host_subject_id"]
> V(net)$litter = sampledata[names(V(net)), "family_relationship"]
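
The check described above can be sketched in base R as follows. This is only an illustration under assumptions: `lookup_vertex_attr` is a hypothetical helper, not part of igraph or phyloseq, and the toy vertex names and metadata values are made up to stand in for `names(V(net))` and `data.frame(sample_data(ps))`.

```r
# Hypothetical helper (not from igraph or phyloseq): look up one metadata
# column for a set of vertex names, stopping with a readable message when
# the lookup cannot succeed.
lookup_vertex_attr <- function(vertex_names, sampledata, column) {
  if (length(vertex_names) == 0) {
    stop("no vertex names supplied")
  }
  if (!column %in% colnames(sampledata)) {
    stop("column '", column, "' is not in sampledata")
  }
  unmatched <- setdiff(vertex_names, rownames(sampledata))
  if (length(unmatched) > 0) {
    stop("no sample data rows for: ", paste(unmatched, collapse = ", "))
  }
  sampledata[vertex_names, column]
}

# Toy stand-ins (made-up values) for names(V(net)) and the sample data
vertex_names <- c("F3D141", "F3D144")
sampledata <- data.frame(
  host_subject_id = c("F3", "F3", "F4"),
  row.names = c("F3D141", "F3D144", "F4D125"),
  stringsAsFactors = FALSE
)

ids <- lookup_vertex_attr(vertex_names, sampledata, "host_subject_id")
print(ids)
```

With real objects, the result could then be assigned with `V(net)$id <- ids`, so a missing column or unmatched sample name is reported before the assignment is attempted.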

spholmes commented 5 years ago

This shows that your ps object is missing. Could you load all the libraries and the ps object again, and then look at ps again?
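
One way to run that check before retrying is sketched below. `check_session` is a hypothetical helper written for this illustration, and the package list is an assumption based on the libraries named in this thread; the sessionInfo() output above lists only base packages as attached.

```r
# Hypothetical helper: report which required packages are not attached
# and which required objects are missing from an environment.
check_session <- function(pkgs, objs, envir = globalenv()) {
  # search() lists attached packages as "package:<name>"
  attached <- sub("^package:", "", grep("^package:", search(), value = TRUE))
  found <- vapply(objs, exists, logical(1), envir = envir, inherits = FALSE)
  list(
    missing_packages = setdiff(pkgs, attached),
    missing_objects  = objs[!found]
  )
}

status <- check_session(
  pkgs = c("phyloseq", "phyloseqGraphTest", "igraph", "ggnetwork"),
  objs = "ps"
)
print(status)
```

Anything listed under `missing_packages` needs a `library()` call, and anything under `missing_objects` needs to be recreated or reloaded before the network code is run.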

Mdrexel2018 commented 5 years ago

Thank you for getting back to me. I tried both solutions and this is what I got.

@jfukuyama I ran all of the commands as you wrote them above, but I still got the same error. I went through the entire pipeline, so everything should be loaded. I am trying to work it out on my computer with the sample data given with the pipeline before I use my own data, so I am using all of the same code as they do. Because of that, I am not sure what part of it couldn't be loaded correctly. Do you have any thoughts?

@spholmes I reloaded all of the libraries and packages and then typed in ps again. This is what I got:

> ps
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 2040 taxa and 14 samples ]
sample_data() Sample Data:       [ 14 samples by 3 sample variables ]
tax_table()   Taxonomy Table:    [ 2040 taxa by 6 taxonomic ranks ]

So I am assuming that the ps object is now able to be found. I ran through that section of code once more and got the same error:

> V(net)$id <- sampledata[names(V(net)), "host_subject_id"]
Error in `V<-`(`*tmp*`, value = 1:4) : invalid indexing

Also earlier you had suggested that I use an older version of ggnetwork. I tried that but it did not seem to change anything.

> package.version('ggnetwork')
[1] "0.5.1"

Thank you so much for helping me!

Mdrexel2018 commented 5 years ago

Hello,

I tried some more things and found that the problem was that the data was in an rds file. I fixed it by reading the file into an object with readRDS() and then substituting that object into the code:

t <- readRDS("C:/Users/x x/Desktop/Drexel 2018/Step1_FASTQ/ts.rds")
net <- make_network(t, max.dist=0.35)
sampledata <- data.frame(sample_data(t))
V(net)$id <- sampledata[names(V(net)), "host_subject_id"]
V(net)$litter <- sampledata[names(V(net)), "family_relationship"]
ggplot(net, aes(x = x, y = y, xend = xend, yend = yend), layout = "fruchtermanreingold") +
  geom_edges(color = "darkgray") +
  geom_nodes(aes(color = id, shape = litter), size = 3 ) +
  theme(axis.text = element_blank(), axis.title = element_blank(),
        legend.key.height = unit(0.5,"line")) +
  guides(col = guide_legend(override.aes = list(size = .5)))

Once I did that, it worked! Again, thank you so much for your help.
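
For readers unfamiliar with rds files, the round trip can be sketched in base R as below. The toy list and the temporary file path are assumptions made for illustration only; they stand in for the saved phyloseq object and its location on disk.

```r
# Toy object standing in for the phyloseq object saved earlier in the pipeline
toy <- list(samples = c("F3D141", "F3D144"), n_taxa = 2040L)

# Hypothetical location: a temporary file rather than a real project path
path <- tempfile(fileext = ".rds")
saveRDS(toy, path)

# readRDS() returns the stored object; nothing appears in the workspace
# until the result is assigned to a name
restored <- readRDS(path)
print(identical(restored, toy))
```

Unlike load(), which restores objects under the names they were saved with, readRDS() returns a single object and leaves the choice of name to the caller, which is why the object has to be assigned before it can be passed to make_network().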

spholmes commented 5 years ago

Hey, this is great, thanks so much! I was about to sit down and redo the whole workflow, because I need it for a course and having an error in there is a real problem. I am so glad you figured it out, and I am sure you also learnt a lot along the way! Congrats, Susan
