linkagescape / linkage-mapper

ArcGIS tools to automate mapping and prioritization of wildlife habitat corridors
https://circuitscape.org/linkagemapper/
GNU General Public License v3.0

Improve intermediate file management for CLM #39

Closed (dkav closed this 6 years ago)

dkav commented 6 years ago

Save intermediate files in scratch dir and, where possible, within a geodatabase.

No longer keep copies of input files.

Use arcpy.delete to ensure removal of associated spatial files (e.g. *.proj).

For now, do not delete scratch gdb as arcpy is not releasing locks on it.
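To illustrate why a dataset-aware delete matters: many spatial datasets are several files sharing one base name, so removing only the main file leaves the companions behind. The sketch below is a plain standard-library illustration, not the arcpy call used in this change, and the helper name `delete_with_sidecars` is hypothetical.

```python
import glob
import os


def delete_with_sidecars(dataset_path):
    """Delete a file plus every sibling file sharing its base name.

    Hypothetical helper for illustration only; it is not part of
    Linkage Mapper or arcpy.
    """
    base, _ = os.path.splitext(dataset_path)
    removed = []
    # Match "<base>.<anything>", e.g. the main file and its sidecars.
    for path in glob.glob(glob.escape(base) + ".*"):
        if os.path.isfile(path):  # skip directories such as a *.gdb folder
            os.remove(path)
            removed.append(os.path.basename(path))
    return sorted(removed)
```

A pattern match like this is only a rough stand-in: it knows nothing about locks or geodatabase contents, which is part of the reason for leaving deletion to arcpy.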

johngallo commented 6 years ago

One thing we added for Linkage Priority Tool (or Linkage Mapper itself? @rgreenefl feel free to comment if you remember off the top of your head) is a function that copies the input files chosen by the user into a section of the outputs directory. This has proven invaluable in the following use case. The analyst runs hundreds of model runs on a virtual machine and then copies the Outputs folder from the dozen or so focal species and structural connectivity runs across the internet to a local machine. Then, months later, they are double-checking methods documentation to be sure they used a particular resistance surface for a particular model run. It was great to be able to check the documentation against the file that was in the outputs folder.

@dkav, judging by your comment, CLM used to do this, but you are proposing to remove that function in this PR? Or am I misunderstanding what "No longer keep copies of input files" means? I need to run now, so I figured it would be better to ask now rather than dig later.

dkav commented 6 years ago

CLM is saving the core feature class and resistance raster but not the climate raster, so it is inconsistent about which inputs it saves. Nor is there any documentation informing the user that (some) input files are being copied and saved when the model is run. Also, the copied files are not the original inputs but versions clipped to the climate raster extent.

From what I can tell, not saving inputs is consistent with Linkage Mapper itself.

I originally had your perspective about keeping the files but I changed my mind. I have taken the view that copying the data is needless duplication. If the files are large and/or disk space is limited, duplicate data could be problematic. Ensuring that inputs are associated with outputs should, in my mind, be the responsibility of the user. Having a good workflow will ensure that this relationship is maintained.

johngallo commented 6 years ago

Would you call it "needless duplication" or that the "costs don't outweigh the benefits"? How about if we give the power user the option to keep the inputs, using the lm_settings file?
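A minimal sketch of what such an opt-in could look like, assuming a boolean flag in the settings file. The names `KEEP_INPUT_COPIES` and `archive_inputs` are hypothetical and do not exist in lm_settings; the code is written for Python 3 and copies single files only.

```python
import os
import shutil

# Hypothetical setting; an actual lm_settings option may be named differently.
KEEP_INPUT_COPIES = False


def archive_inputs(input_paths, output_dir, keep_copies=KEEP_INPUT_COPIES):
    """Copy the user's input files into <output_dir>/inputs when enabled.

    Returns the list of copied paths, or an empty list when the
    option is off (the default, so no duplicate data is written).
    """
    if not keep_copies:
        return []
    dest_dir = os.path.join(output_dir, "inputs")
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for src in input_paths:
        dest = os.path.join(dest_dir, os.path.basename(src))
        shutil.copy2(src, dest)  # copy2 also preserves timestamps
        copied.append(dest)
    return copied
```

With the flag off by default, users short on disk space see no change, while power users can keep a record of inputs next to their outputs. Multi-file datasets and geodatabase feature classes would need a dataset-aware copy rather than `shutil`.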

dkav commented 6 years ago

I don’t have a problem having an option added but let’s put that as a separate task. The new task (#40) can build on this merge request.

johngallo commented 6 years ago

Excellent idea to split the issues. Nice. Onward!