Closed kearney-sp closed 3 years ago
Yeah, you're right. I forgot to implement that. Apologies. PR for that will be up soon.
wonderful - thanks!
On a similar note: how hard would it be to output grids as integers instead of floats? In pairwise mode, the output values of each current map are between 0 and 1. Couldn't you just multiply them by, say, 10000 and output an integer instead? This would save a lot of space when outputting many individual current maps. It could even be a separate flag. One complication is that the cumulative current map may then exceed the maximum integer value, but you could log transform or compute the cumulative map on the original 0-1 values.
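To make the suggestion concrete, here is a minimal sketch of the scale-and-convert idea. The helper name `scale_to_int` and the file names are made up for illustration and are not part of Circuitscape.jl; the factor of 10000 is the one proposed above.

```julia
using DelimitedFiles

# Hypothetical helper (not a Circuitscape.jl function): multiply values in
# [0, 1] by `factor` and round to the nearest integer.
scale_to_int(cmap; factor = 10_000) = round.(Int, cmap .* factor)

cmap = rand(200, 200)          # stand-in for one pairwise current map
imap = scale_to_int(cmap)

dir = mktempdir()
float_path = joinpath(dir, "curmap_float.txt")
int_path = joinpath(dir, "curmap_int.txt")
writedlm(float_path, cmap)
writedlm(int_path, imap)

# Compare the on-disk sizes in bytes
println(filesize(float_path), " bytes as float, ", filesize(int_path), " bytes as integer")
```

Any cumulative map would still be summed from the original floating-point values, so only the per-pair maps lose precision.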
Sure. You could get the same savings by just rounding off to fewer digits than the standard 8 digits I currently write (https://github.com/Circuitscape/Circuitscape.jl/blob/master/src/out.jl#L349). I did a simple experiment to verify that this is true.
julia> a = rand(10^6);
julia> using DelimitedFiles
julia> writedlm("test1", a)
julia> writedlm("test2", round.(a,digits=3))
shell> du -sm test1 test2
19 test1
6 test2
If you're ok with this idea, I can implement it.
There's also a really experimental single-precision mode that you can try. You can use it by specifying precision = single in your INI file. This uses 32-bit floating-point numbers in all the calculations as opposed to the standard 64-bit. However, I don't know how accurate the answers are. Maybe you can try this too?
I have experimented with the ‘precision = single’ flag in my INI file and get strange results: values near focal nodes are much higher relative to values at more distant nodes than with the ‘precision = double’ option. I think what would be ideal (for my application anyway) would be to reduce the precision only after the linear solve, just before writing the current map for that focal node pair. Basically, rescale the results (say via a log transform and conversion to integer) and then write that. Then you could probably get down to 16 bits. My problem isn’t memory, but rather disk space to write all of the individual current maps! I would like them all written in order to do some custom weighting in post-processing.
Would this be a simple implementation? Rather than round, do a log transform, multiply by some factor (say 1000) and convert to integer?
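A rough sketch of what that could look like, applied to the solved map just before writing. Everything here is an assumption for illustration: the helper `log_to_int16`, the factor of 1000, and the choice to reserve the smallest Int16 as a nodata marker are not taken from Circuitscape.jl.

```julia
# Hypothetical helper: log10-transform positive values, scale by `factor`,
# and store the result as Int16. Non-positive (or NaN) cells get `nodata`.
function log_to_int16(cmap; factor = 1000, nodata = typemin(Int16))
    out = fill(nodata, size(cmap))
    for i in eachindex(cmap)
        v = cmap[i]
        if v > 0
            scaled = round(log10(v) * factor)
            # Clamp so very small values do not overflow or collide with `nodata`
            out[i] = Int16(clamp(scaled, typemin(Int16) + 1, typemax(Int16)))
        end
    end
    return out
end

cmap = [1.0, 0.1, 1e-3, 0.0]
encoded = log_to_int16(cmap)   # Int16[0, -1000, -3000, -32768]
```

With a factor of 1000, anything smaller than roughly 10^-32.7 is clamped to the lowest non-nodata value, so the factor trades resolution against range.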
Sure, but I think the right approach would be to introduce a flag in the INI file which controls the number of digits of precision that the user wants in their output current map. That accomplishes the same thing, plus you don't need to rescale (it doesn't make a difference whether you're writing integers or floating-point numbers if you're writing the same number of digits).
Also, would you like to try your hand at PR? :)
OK, that makes sense. I am not used to working with .asc, to be honest! If I’m not mistaken, I think there will still need to be a rescaling step, right? Otherwise, very small values will just get set to ‘0’, rather than something near the lowest possible number with the defined number of digits.
And what is PR?? Maybe?! :)
I believe people log-scale their maps when they get really tiny values. They set the log_transform_maps flag to true. And then I end up writing 8 digits of precision of the log-transformed map. You could control this 8 via another appropriately named flag, if you'd like to save space. Does this approach make sense?
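A small illustration of why the log transform matters once the number of digits is reduced. This uses only base Julia's `round` and `log10`; the sample values and the choice of 3 digits are arbitrary.

```julia
tiny = [0.25, 1e-5, 3e-9]

# Rounding the raw values: anything below the rounding threshold collapses to 0.0
raw_rounded = round.(tiny, digits = 3)

# Rounding the log-transformed values: small entries stay distinct
log_rounded = round.(log10.(tiny), digits = 3)

println(raw_rounded)
println(log_rounded)
```

So with the log-transformed map, a digits flag alone should be enough, and no separate rescaling step is needed.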
PR stands for "pull request". It's just a way of asking you for code contribution. :-) Here's a simple guide on how to create one.
That seems like it would work well then - a flag to control precision. Between that and compression, the user would have a lot of control over output file size.
One more thought: in my case, I would like to differentiate between values lower than the lowest allowable value (given the precision) and values that have been masked. Can you think of a way to easily differentiate between these? One option would be to let the user choose whether values lower than the precision are set to the minimum value or to zero (this would require yet another flag). Otherwise, they will get set to zero and then NA once log transformed, same as the masked values.
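One way this could work, sketched under assumptions: the helper `round_with_floor`, its `mask` argument, and the -9999 nodata value are all hypothetical and only meant to show the idea of raising sub-threshold values to the smallest representable value while leaving masked cells marked separately.

```julia
# Hypothetical helper: masked cells become `nodata`; positive values below the
# rounding threshold are raised to the threshold instead of rounding to zero.
function round_with_floor(cmap, mask; digits = 3, nodata = -9999.0)
    floor_val = 10.0^(-digits)          # smallest value that survives rounding
    out = similar(cmap, Float64)
    for i in eachindex(cmap, mask)
        if mask[i]
            out[i] = nodata
        elseif 0 < cmap[i] < floor_val
            out[i] = floor_val
        else
            out[i] = round(cmap[i], digits = digits)
        end
    end
    return out
end

cmap = [0.5, 2e-6, 0.0, 0.1234567]
mask = [false, false, false, true]      # true marks a masked cell
result = round_with_floor(cmap, mask)
```

Here the second entry is kept at the threshold rather than dropping to zero, while the masked fourth entry stays at the nodata value, so the two cases remain distinguishable after a later log transform.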
Gotchya - I will look at the pull request guide! This is my first foray into Julia, but perhaps in the near future I can contribute.
Let's close if implemented.
Now that .tif writing is implemented on master, you can get some pretty good file size savings by setting write_as_tif = true in the .ini. GeoTIFFs will be written with lossless LZW compression by default, which will make file sizes much smaller than an equivalent ASCII grid. That, plus the single-precision option now implemented, sufficiently addresses file size issues in general IMO (when single precision is used, 32-bit TIFFs will be written instead of 64-bit, saving even more space).
Close due to implementation of compression when writing as tif?
When "compress_grids = True", output current maps are not compressed in Circuitscape.jl. I have only tried running pairwise mode. In Circuitscape 4.05, output is compressed into .gz format as expected when running the same .ini file.