Map Directory discussion

missyschoenbaum commented 10 years ago

Capturing discussion from Josiah:

Ok, now that I see that it's a directory structure, and given that we already have a managed file solution, I'm definitely inclined to go with just a checkbox that says "Output Map Directory" (which we already have). I'll create a directory called _map_output in the workspace. Then it can have the same overwrite behavior as the rest of the Results.

Though that leads naturally to the next question, do we really want new Results to delete and replace old Results when you run a simulation? Those could be separated if it's desirable. Or we could have a "Results" directory where we just create a new DB file every time the simulation is run again. I think the concern is that we don't want old results associated with a Scenario whose parameters have changed.

On Wed, May 7, 2014 at 4:10 PM, Schoenbaum, Melissa - APHIS Melissa.Schoenbaum@aphis.usda.gov wrote:

Neil, does any part of this happen in your world? I may need better specs on this (and it may drop to a lower priority).

1) Can you specify a file name for your map output? No, it is generated.
2) Is it alright to output multiple scenarios to the same directory? Seems OK to me as long as they have unique names.
3) Do we even need this parameter at all? We could just prepend the scenario file name (unique) to the _map.csv. If Neil doesn’t need it, would we just make it a checkbox Y/N to get map output?

Here’s what happens when I run it…

It looks like it does prepend and build a name; see the one I just ran today.

It creates a directory per iteration, with multiple files.

From: Josiah Seaman [mailto:josiah@newlinetechnicalinnovations.com]
Sent: Wednesday, May 07, 2014 3:55 PM
To: Neil Harvey; Schoenbaum, Melissa - APHIS
Cc: Pyle, Alexander - OCIO/EAS
Subject: Map Directory

Hello, I was just working through Missy's test cases on #27. I was hoping someone could explain for me why we specify a "daily states file" but for map we specify a "map directory". Does this output multiple files? The way I have the file system set up at the moment (sandboxed), the only directory you can output to is the "workspace" directory, which is fixed.

1) Can you specify a file name for your map output?
2) Is it alright to output multiple scenarios to the same directory?
3) Do we even need this parameter at all? We could just prepend the scenario file name (unique) to the _map.csv.
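As a minimal sketch of the "prepend the scenario file name" idea from question 3: the function name `map_output_path`, the scenario file name `sample_scenario.sqlite3`, and its extension are all hypothetical and only illustrate the naming rule, not how the simulation actually builds its names.

```python
from pathlib import Path


def map_output_path(workspace: Path, scenario_file: str) -> Path:
    """Return <workspace>/<scenario name>_map.csv (hypothetical naming rule)."""
    # Strip the directory and extension so only the unique scenario name remains.
    scenario_name = Path(scenario_file).stem
    return workspace / f"{scenario_name}_map.csv"


# The workspace is the only writable directory, so every output lands there.
workspace = Path("workspace")
first = map_output_path(workspace, "sample_scenario.sqlite3")
second = map_output_path(workspace, "other_scenario.sqlite3")
```

Because the scenario file name is unique, two scenarios written to the same directory would not collide under this rule.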

ndh2 commented 10 years ago

In NAADSM 3.2.19, in the Output Options screen, there is an option to turn on/off "NAADSMap output". There is also a "Folder for NAADSMap output" setting. I believe NAADSMap was an external visualization system; I'm fairly sure these options have no relation to the weekly_gis_writer and summary_gis_writer modules in the C code.

missyschoenbaum commented 10 years ago

More research on this functionality: I asked Shaun how the map outputs are generated. His answer:

I’ve located the place where the NAADSMap outputs are written in 3.2.19. It is in the Delphi source code in a program file named NAADSMap.pas. It appears to be, basically, a copy of the data that would already be found in Neil’s normal SC output.

missyschoenbaum commented 10 years ago

Team discussed this. Tim is the only one who has used map outputs. He said he only uses one of the 5 files that are produced. Since it is by iteration, he had to pull each file into GIS and manipulate it (so, 1,000 imports and manipulations in his example; this made me hurt). It was the DaysPremStat file. From what I can tell, it is a union of 3 datasets: the initial state, the daily state, and a final state, and it only looks at detected units within the daily state section. I also have this question out to Barbara to see if she knows more. I am going to propose that, instead of recreating this whole directory/file structure, we create the single file that seems to be used and provide it in the outputs by iteration.
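To make the "union of 3 datasets" concrete, here is a rough sketch using an in-memory SQLite database. The table `unit_state`, its columns, and the sample rows are assumptions made up for illustration; they are not the real output schema or the actual contents of the DaysPremStat file. It uses `UNION ALL` because the three sections cover different days in this toy data.

```python
import sqlite3

# Hypothetical table: one row per unit per recorded day of an iteration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE unit_state (
    iteration INTEGER, day INTEGER, unit_id INTEGER,
    state TEXT, detected INTEGER
);
INSERT INTO unit_state VALUES
    (1, 0, 10, 'Susceptible', 0),
    (1, 0, 11, 'Latent', 0),
    (1, 1, 11, 'Clinical', 1),
    (1, 2, 10, 'Latent', 0),
    (1, 3, 10, 'Clinical', 0),
    (1, 3, 11, 'Destroyed', 1);
""")

# Three selects joined by a union: initial day, detected units on the
# days in between, and the final day. Each uses a nested subselect.
QUERY = """
SELECT 'initial' AS section, day, unit_id, state
  FROM unit_state
 WHERE iteration = :it
   AND day = (SELECT MIN(day) FROM unit_state WHERE iteration = :it)
UNION ALL
SELECT 'daily', day, unit_id, state
  FROM unit_state
 WHERE iteration = :it
   AND detected = 1
   AND day > (SELECT MIN(day) FROM unit_state WHERE iteration = :it)
   AND day < (SELECT MAX(day) FROM unit_state WHERE iteration = :it)
UNION ALL
SELECT 'final', day, unit_id, state
  FROM unit_state
 WHERE iteration = :it
   AND day = (SELECT MAX(day) FROM unit_state WHERE iteration = :it)
 ORDER BY day, unit_id
"""

rows = conn.execute(QUERY, {"it": 1}).fetchall()
for row in rows:
    print(row)
```

In this toy data the undetected change on day 2 is left out of the daily section, which mirrors the "only detected units" observation above.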

What this means to dev cycle: another file added to output file requirements, remove the whole GUI interaction with making an output directory, remove functionality of creating the map outputs (which I had not spec'd in an adequate fashion to start with)

Once I hear from Barbara, I will make an issue to support this.

josiahseaman commented 10 years ago

Let's see if I understand this well enough to summarize:
1) Most of the map output is unneeded because it is redundant with the event information that is already in the simulation output.
2) The useful part is a timeline of Unit state changes with locations on them. This information can be extracted for each iteration, but it's nice to have it all in one place.

In summary, this is a convenient way of organizing data for visualization. It's useful for the GUI to sort and extract a movie of Unit state changes that shows the outbreak progressing, so that you can animate a map. This information could also be exported, but in that case it should be in a standard GIS format. Unless there is immediate demand for GIS export, this sounds like a solid Phase 3 Data Visualization project.

I would use the Google Maps API, where each event is an overlay icon on the map. I'd add my own buttons for "Next, Previous, Play" and "Prev / Next Iteration". Then we can use JavaScript to just draw overlays onto the map as they happen.

missyschoenbaum commented 10 years ago

It is a good data visualization candidate.
It can be extracted, but requires 3 queries (with nested subselects) that use a union to build the final product.
The way Tim used it was to calculate the samples required for adequate surveillance, which uses a set of rules that are dictated at an international level. For example, when a unit has been infected there is a requirement to take x samples per Y animals in a z radius. Just like zones, there are different levels, so as you move away from the infection, the samples drop. So he needed the spatial operations to determine the pool of candidates to sample, and he did this 1,000 times.
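A rough sketch of the kind of spatial operation described above, for a single infected unit. Everything specific here is hypothetical: the tier radii, the sampling rates, the unit records, and the function names are placeholders, since the actual international rules (the x, Y, and z values) are not given in this thread.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


# Hypothetical tiers, innermost first: (outer radius in km, samples per 100 animals).
TIERS = [(3.0, 10), (10.0, 5)]


def samples_required(infected, units):
    """Map unit id -> sample count for every unit that falls inside a tier."""
    plan = {}
    for unit in units:
        distance = haversine_km(infected["lat"], infected["lon"], unit["lat"], unit["lon"])
        for radius_km, per_100 in TIERS:
            if distance <= radius_km:
                plan[unit["id"]] = math.ceil(unit["animals"] * per_100 / 100)
                break  # the innermost matching tier wins
    return plan


infected = {"lat": 31.0, "lon": -100.0}
units = [
    {"id": "A", "lat": 31.01, "lon": -100.0, "animals": 250},  # about 1.1 km away
    {"id": "B", "lat": 31.05, "lon": -100.0, "animals": 120},  # about 5.6 km away
    {"id": "C", "lat": 31.50, "lon": -100.0, "animals": 400},  # about 56 km away
]
plan = samples_required(infected, units)
```

Repeating this for every detected unit in every iteration is what turns a simple calculation into 1,000 rounds of GIS work.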

josiahseaman commented 10 years ago

The fact that it is a complicated yet constant database query makes it a very good candidate for being hard coded into the GUI on the Outputs side. We don't even necessarily need to have an "Outputs Setting" mention of it. Map output could just be one of the windows you can open up with a "Populate Map" button. If the preferable export format is a database, we could create a table that remains blank until you hit the button.

The difference between flagging it as desired output at the beginning or the end is really just a matter of speed. I would bet that the database could optimize the filter/sort/populate process if done in batch.

missyschoenbaum commented 10 years ago

I was happy with my solution, until I started calculating up the number of records that was going to generate. For example, Amy's Texas file would have almost 800,000 records just to set the baseline and final day. Add in the daily events, and that likely goes to a million. And, if I run 1,000 iterations then we have lots of records. I am trying to pull the right block of events off her SC file to get a more realistic view of this.
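The back-of-the-envelope arithmetic above can be written out as follows. The per-iteration figure is the rough "likely goes to a million" estimate from this comment, not a measured count, and the variable names are only illustrative.

```python
# Rough figures quoted in the discussion, not measured values.
baseline_and_final_records = 800_000   # "almost 800,000" for the baseline plus the final day
records_per_iteration = 1_000_000      # estimate once the daily events are added
iterations = 1_000

# Implied share of the estimate that comes from daily events.
daily_event_records = records_per_iteration - baseline_and_final_records

total_records = records_per_iteration * iterations
print(f"{total_records:,} records across {iterations:,} iterations")
```

At these estimates, a 1,000-iteration run lands on the order of a billion records.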

josiahseaman commented 9 years ago

This is a useful conversation for later, but for now, Map Outputs go in the Supplemental Files #161. There is a directory for every scenario and a Map subdirectory. People are free to browse this on their machines. No data visualization is necessary right now. The final thing we need to do with Map Output is just some file handling #181.
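For illustration only, the layout described here (one directory per scenario with a Map subdirectory) might be created like this. The helper name `ensure_map_dir`, the scenario name, and the exact folder name "Map" are assumptions; the real file handling is tracked in #181.

```python
import tempfile
from pathlib import Path


def ensure_map_dir(workspace: Path, scenario_name: str) -> Path:
    """Create <workspace>/<scenario_name>/Map if needed and return it."""
    map_dir = workspace / scenario_name / "Map"
    # parents=True builds the scenario directory too; exist_ok=True allows reruns.
    map_dir.mkdir(parents=True, exist_ok=True)
    return map_dir


# A temporary directory stands in for the real workspace in this sketch.
with tempfile.TemporaryDirectory() as tmp:
    created = ensure_map_dir(Path(tmp), "sample_scenario")
    relative = created.relative_to(tmp).as_posix()
    existed = created.is_dir()
```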

missyschoenbaum commented 9 years ago

Sounds good. I would rather figure out a programmatic way to manipulate geospatial files, and that is outside the scope of this project.

NAVADMC / ADSM

Map Directory discussion #51