openconnectome / FlashR--

Apache License 2.0

button to download graph, derivatives, and code to generate them #12

Closed jovo closed 8 years ago

jovo commented 9 years ago

i think it'd be nice to be able to download the graphml file and select which statistics, layouts, partitions, etc., one also wants to download, as well as the code, random seed, etc. i have no idea how difficult/interesting this is to implement, so if terrible, just ignore :)

dmarchette commented 9 years ago

Some of this is easy. Some may be tricky. It depends on what you want. The big deal is figuring out how to specify stuff. Some ideas:

  1. Allow one to load a layout for graph plotting. As it is now, one can put attributes x, y and (if desired) z on each vertex, and choosing Coordinates in the layout will use these. Right now these are hard-coded to be named x, y, z, but I am going to modify it so that you can check the attributes you want for x, y and z. Alternatively, one could allow the user to select a csv (maybe other formats) with x, y and (if desired) z, with either the vertex id in a column or the order of the rows indicating the order of the vertices.
  2. Something I want to do, but didn't work the first time I tried it, and may be tedious (but I have an idea) is a 'save state'/'load state' button. This would allow one to set up the graph layout, statistics, whatever (say on one graph) then save this out so that (a) one can come back to the same state later on (b) one could do "the same thing" to another graph. Some issues need to be addressed, like what if the graphs have different attributes and stuff.
  3. Save specific analyses. So, if I've computed something cool, I want to be able to (a) save any plots (not take a screenshot, which is what I have to do now) (b) save out any statistics into a table (csv or RData or something) (c) this is a bit wilder: save out the code that will perform the analyses. So for the last, suppose I figure out what I want to do using a small network, and I then want to do that on a big one. I could save the code as an R file, then offline source the R file (maybe after changing a few things to utilize parallel processing, or some other graph software for computing things, or some such). This might be a bit tricky, we'll see.
  4. Somewhat unrelated: Right now the communities stuff runs whatever community algorithm, but there's not much analysis of the communities. At the very least, one should be able to correlate attributes to community, plot them, use communities to stratify various statistics, subgraph by communities and do stuff, all that. Not sure how best to do this.
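A rough illustration of the 'save state'/'load state' idea in item 2, including the open question of graphs with different attributes. This is a Python sketch for discussion only, not code from the app (which is written in R); the function names, the JSON file format, and the `coordinate_attributes` key are all hypothetical.

```python
import json


def save_state(settings, path):
    """Write the current UI settings (a plain dict) to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(settings, fh, indent=2, sort_keys=True)


def load_state(path, available_attributes):
    """Read saved settings and reconcile them with another graph.

    Attributes named in the saved state but absent from the new graph
    are dropped from the settings and reported back, so the caller can
    warn the user instead of failing.
    """
    with open(path, encoding="utf-8") as fh:
        settings = json.load(fh)
    wanted = settings.get("coordinate_attributes", [])
    missing = [a for a in wanted if a not in available_attributes]
    settings["coordinate_attributes"] = [
        a for a in wanted if a in available_attributes
    ]
    return settings, missing
```

Returning the missing attribute names rather than raising an error is one possible answer to "what if the graphs have different attributes"; other policies (abort, prompt for a replacement) would fit the same shape.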

dmarchette commented 9 years ago

Take a look. If you select "Coordinates" as the plotting method, it brings up a tool to let you choose the vertex attributes you want to use as the coordinates. I think I want to turn these into radio buttons, instead, but let me know. One thing this way gives you is the ability to order the variables and move their order around, which radio buttons wouldn't allow. What it does with categoricals is that it gives them an arbitrary number. So, if there are 8 cell types, then the cell type coordinates will be the numbers 1 to 8. Numerical vectors are numerical.

Known bugs:

  1. I don't check for NA values -- I need to map these to something, otherwise who knows what it's doing.
  2. It doesn't handle non-vector attributes. One should be able to have a vertex attribute that is "Coordinates", a 2- or 3-vector for each vertex. I'm not sure what will happen in this case, but it won't be pretty.
  3. The igraph plotter scales so the coordinates are nice. The fast plotter doesn't. For now, this is a feature. (If you haven't noticed, the igraph plotter handles multiple edges, loops, and non-straight edges, none of which my fast version does, which is one reason it's faster.)
  4. There's a bit of a kludge going on as you select variables until you have at least 2 and it can plot. I'm not sure that's a problem, but it is inelegant.
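For readers following along, the categorical-to-number mapping described above, together with one way to address known bug 1, might look roughly like this. It is a Python sketch, not the app's R code: the function name is hypothetical, numbering categories in first-seen order is an assumption (the comment only says the number is arbitrary), and mapping NA values to 0 is just one possible choice.

```python
def encode_attribute(values, na_value=0.0):
    """Turn one vertex attribute into numeric plotting coordinates.

    Numeric attributes pass through as floats. Categorical attributes
    get the codes 1..k in first-seen order. Missing values (None or
    NaN) are mapped to na_value, an assumed placeholder.
    """
    def is_missing(v):
        # NaN is the only float that is not equal to itself.
        return v is None or (isinstance(v, float) and v != v)

    present = [v for v in values if not is_missing(v)]
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool)
        for v in present
    )
    if numeric:
        return [na_value if is_missing(v) else float(v) for v in values]

    codes = {}
    for v in present:
        codes.setdefault(v, len(codes) + 1)
    return [na_value if is_missing(v) else float(codes[v]) for v in values]
```

With 8 distinct cell types this yields the codes 1 to 8, matching the behaviour described in the comment.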

Next up: a checkbox to select a file from the disk. What it will do (initially) is let you read in a csv table, and it will populate the coordinates list from the columns of this. Same deal with categoricals. Note: It will take the coordinates in order, up to the number of vertices, and if there aren't enough rows, I'll probably do something like wrap the data back around.

That'll take a day or two, I'll let you know.
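The planned csv behaviour (take rows in order up to the number of vertices, and wrap around if there are too few rows) can be sketched as follows. This is an illustrative Python version under stated assumptions, not the eventual implementation: the function name and parameters are hypothetical, the csv is assumed to have a header row, and the selected columns are assumed to be numeric.

```python
import csv
import io


def coordinates_from_csv(text, columns, n_vertices):
    """Build one coordinate tuple per vertex from csv text.

    Rows are used in file order. If the table has fewer rows than
    there are vertices, the rows are reused from the top (wrap-around).
    Extra rows beyond n_vertices are ignored.
    """
    rows = list(csv.DictReader(io.StringIO(text)))
    if not rows:
        raise ValueError("csv has no data rows")
    coords = []
    for i in range(n_vertices):
        row = rows[i % len(rows)]  # wrap back around when rows run out
        coords.append(tuple(float(row[c]) for c in columns))
    return coords
```

Wrapping silently keeps the plot working but can hide a mismatched file, so a real version might also want to warn when the row count differs from the vertex count.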

jovo commented 9 years ago

i love the save state idea, i think we can support that by having people log in to OCP, using their OCP account, and they automatically get some scratch space, which they can use to save state, analyses, etc. and they can dump it to some other place (eg, their local computer, some cloud storage, etc.), if they want.

jovo commented 9 years ago

cool. do you have any ideas for how to do it when the number of vertices is like 100,000? perhaps the current solution just 'works'?
