Closed euronion closed 4 years ago
I guess it was possible due to the custom geokit
version used there as well.
I found the PR in the geokit
(https://github.com/FZJ-IEK3-VSA/geokit/pull/1) already trying to make this happen.
My only concern with such a function would be in a situation where someone inputs an iterable of exactly two values. For example:
ec.excludeRasterType( "path/to/file.tif", value=[5,7])
In this case it would be ambiguous if the user means to exclude exactly the values 5 & 7, or if they want to exclude the range 5-7. Maybe the solution would be to say a list implies exact values, and a tuple implies a range. But this seems unintuitive to me, and I'm also not sure what to do with derived iterables such as numpy arrays or generators.
Thoughts?
Maybe another way to handle the input would be to always expect either a numeric value (for an exact indication) or a string. Then mathematical notation could be used. E.g.:
I find the a tuple (lower, upper)
or even slice(start, stop)
to be intuitive for ranges and lists of values for single values to exclude.
I think parsing mathematical notation is tempting but makes the code more complicated than necessary.
Or maybe more pythonic: keep value=
for a single value / a range of values and introducevalues= [...]
for a list of values to consider.
I think parsing mathematical notation is tempting but makes the code more complicated than necessary.
I can see your point here. But do you think it would be SO bad? I think doing this would fit well into a newer feature I've recently added which you may not be aware of. The purpose of this is to allow the complete description of the steps in an eligibility analysis to come from a CSV file or database (anything which translates to a pandas dataframe, actually). This allows various constraint scenarios for multiple geo-political regions to be stored and altered as git- and user-friendly text files.
Or maybe more pythonic: keep
value=
for a single value / a range of values and introducevalues= [...]
for a list of values to consider.
This doesn't quite fit from my perspective, since a single range indication also involves multiple explicit values, and hence I could imagine someone making the mistake of saying values=(lower,higher), which would only indicate the explicit values, when they actually meant to have said value=(lower,higher). This would be disastrous if they were indicating from a float-typed raster file and weren't being careful to check that their output makes sense. (Unfortunately, I've seen this often :( )
Sounds like a helpful feature when doing a lot of specific or high-resolution eligibility analysis. Nice!
Confusing value
and values
is indeed a bit risky. I guess the mathematical notation would be an elegant solution. Not sure if there is a standard library for that purpose though.
Not sure if there is a standard library for that purpose though
No need for an additional module. regular (not 'renewable', uff) expressions to the rescue!
I wanted to test the idea, so I've pushed a temporary branch "math_notation" with a change that makes the indicateValues function in GeoKit behave in the way we discussed. I actually like it quite a bit, and could imagine incorporating it permanently. Anyway, before I go on I would first make sure it fits your use case. Would you mind checking it out for me? I'm also open to other approaches/suggestions you might have.
The relevant update can be found here And the docstring is updated here
Wow, that was fast! I didn't think of regular expression, smooth approach (Leaving this here).
On the first glimpse it is looking good:
I didn't test all combinations of possible value
, only half-open range "[--50]"
, list "[44,255]"
and a mix of of closed range and list "[12-20],21,22,[21-32]"
.
I guess the thing tickling my nose the most (which is a good thing, because it is a minor thing) was the use of [0-9]
instead of \d
in the regex.
value_re = re.compile(r"(?P<range>(?P<open>[\[\(])(?P<low>-?\d+\.?\d*)?-(?P<high>-?\d+\.?\d*)?(?P<close>[\]\)]))|(?P<value>-?\d+\.?\d*)")
I guess the thing tickling my nose the most (which is a good thing, because it is a minor thing) was the use of [0-9] instead of \d in the regex.
I had totally forgotten about this notation. Thanks for the tip!
I also noticed that the previous regular expression would not have matched a number which doesn't lead with a digit. For example, "0.72" would be okay, but ".72" would not. So I added a fix to both of these issues
Finishing touches (|\d+
becoming |\d+\.?
):
r"(?P<range>(?P<open>[\[\(])(?P<low>[-+]?(\d*\.\d+|\d+\.?))?-(?P<high>[-+]?(\d*\.\d+|\d+\.?))?(?P<close>[\]\)]))|(?P<value>[-+]?(\d*\.\d+|\d+\.?))"
else you would match "1,2,.2,[3-],[3.3-],[.3-],[-3],[-3.3],[-.3]"
, but not 3., [3.-], [-3.]
.
And while is doesn't make contextual sense in this method, the range "[-]"
is currently a valid input which should break the code (didn't test it).
To capture this case, you could elif m['low'] or m['high']:
Finishing touches (|\d+ becoming |\d+.?)
Nice catch. I've updated the line.
About the "[-]" case, I can imagine an uncommon scenario where someone whats to indicate every pixel in a raster file with any finite value (i.e. not 'NaN'). It's odd for sure, but I would nevertheless elect to keep the feature just in case.
Hello again @euronion. If the current state of this update works for you, then I would go ahead and merge it into the master
Sure. No objections from my side and AFAICT working smoothly for me.
Thanks a lot!
The change is incorporated into master now with version 1.1.8
I currently do not see a way to include (= not exclude) as list of non-consecutive raster types using
excludeRasterType(...)
or any other function for that matter.Excluding a list of non-consecutive raster types can be easily done by iterating the list. Including a list of consecutive raster types can be realised using the
invert=True
switch. But usinginvert=True
to only consider certain raster types with a list of non-consecutive values does not seem possible.Is it possible to include such a feature?
NB: It seems to have been possible in the past (I presume unintentionally), e.g. with this fork (https://github.com/PyPSA/glaes) used in
pypsa-eur
.