Closed jamaa closed 2 years ago
Also, the column "Flaeche" in the EFL file should be float instead of integer. This column is also a bit tricky in that rounding errors can cause issues in BlueM.Sim, because the sum of all values for one EZG has to equal 100% (tolerance: 0.001).
You are correct, it is line 785 for the changes in the BOD file data type definitions. For the "Flaeche" in the EFL it is in line 819.
I changed all of the mentioned definitions in the Excel file (where it is a lot easier to see) and uploaded it together with a revised CSV.
Regarding the rounding error: do you think there should be a warning for the user that the values don't add up to 100% for a given EZG? Or should the plugin change the values on its own to get to 100% (-> problematic)?
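A warning of this kind could be sketched roughly as follows. This is only an illustration, not code from the plugin: the function name `find_unbalanced_ezgs`, the `(ezg, flaeche)` row layout and the default tolerance of 0.001 percentage points are all assumptions.

```python
import math
from collections import defaultdict


def find_unbalanced_ezgs(rows, tolerance=0.001):
    """Return {ezg: total} for every EZG whose Flaeche values miss 100 %.

    `rows` is an iterable of (ezg, flaeche) pairs; this layout is hypothetical.
    `tolerance` is in percentage points (assumed to match BlueM.Sim's 0.001).
    """
    values_by_ezg = defaultdict(list)
    for ezg, flaeche in rows:
        values_by_ezg[ezg].append(flaeche)

    unbalanced = {}
    for ezg, values in values_by_ezg.items():
        total = math.fsum(values)  # fsum limits float accumulation error
        if abs(total - 100.0) > tolerance:
            unbalanced[ezg] = total
    return unbalanced


rows = [("A", 80.06), ("A", 19.95), ("B", 60.0), ("B", 40.0)]
for ezg, total in find_unbalanced_ezgs(rows).items():
    print(f"Warning: Flaeche values for EZG {ezg} sum to {total:.3f} %, not 100 %")
```

Reporting the offending EZGs without touching the values leaves the decision about how to fix them with the user.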
Thanks for the quick fix! What about the automatically generated geopackage, do datatypes need to be changed in there as well? Or does that use the definitions in the csv as well?
Regarding rounding, I agree that automatically changing values can be problematic. But if the original values in the feature class (before rounding) add up to 100%, the rounded values written to the EFL-file should be automatically adjusted, if necessary. Would that be possible without too much effort? Or am I just imagining that this could ever be a problem, given that there is a tolerance of 0.001?
You're welcome :-) The layer columns in the geopackage are created by the "append_layer_generic" function (see line 2688), which also appends existing layers if ordered to. This function is completely based on the definitions in the csv.
If a tolerance of 0.001 means a tolerance of 0.1 %, i.e. a sum of 99.9% would be acceptable, there should not be an issue.
The "Flaeche" field of the EFL is encoded as a float with a maximum length of 6 characters. If we subtract the integer digits and the decimal point, that leaves space for at least 3 decimal places (4 for values smaller than 10; the exception is "100.00", but that doesn't matter here).
Therefore the deviation / rounding error can't be more than ±0.0005 % per element of a single EZG. That means that to exceed the tolerance, a single EZG would need 200 parts, all rounded in the same "direction" (+ or -) with their maximum deviation, which is highly unlikely.
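To make the 6-character argument concrete, here is a minimal sketch of fitting a value into a fixed-width field with as many decimal places as possible. The helper `format_fixed_width` is hypothetical and is not the plugin's actual formatting code; it also relies on Python's built-in float formatting, whose handling of exact half cases may differ from what the plugin does.

```python
def format_fixed_width(value, width=6):
    """Format `value` with as many decimal places as fit into `width` characters."""
    # Try the highest precision first and reduce it until the text fits.
    for decimals in range(width - 2, -1, -1):
        text = f"{value:.{decimals}f}"
        if len(text) <= width:
            return text
    raise ValueError(f"{value} does not fit into {width} characters")


for value in (100.0, 80.055, 5.0005):
    print(repr(format_fixed_width(value)))
```

With a width of 6, this yields 2 decimal places for 100, 3 for two-digit values and 4 for values below 10, which matches the counts given above.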
In conclusion: if the input data adds up, the tolerances won't be exceeded by rounding the values to fit in the EFL format.
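The estimate can be checked with a few lines of arithmetic. The sketch below assumes that every entry is rounded to the same number of decimal places and that all entries err in the same direction by the maximum amount; `entries_to_reach_tolerance` is a made-up helper name. It is evaluated for both readings of the tolerance (0.1 % and 0.001 %), since which one applies is an open question at this point.

```python
from decimal import Decimal, ROUND_CEILING


def entries_to_reach_tolerance(decimals, tolerance):
    """Smallest number of worst-case rounded entries whose errors sum to `tolerance`."""
    # Maximum rounding error of one entry: half a unit in the last decimal place.
    max_error = Decimal(5) * Decimal(10) ** -(decimals + 1)
    ratio = Decimal(tolerance) / max_error
    return int(ratio.to_integral_value(rounding=ROUND_CEILING))


# Reading 1: the tolerance of 0.001 is a fraction, i.e. 0.1 percentage points.
print(entries_to_reach_tolerance(3, "0.1"))
# Reading 2: the tolerance is 0.001 percentage points.
print(entries_to_reach_tolerance(3, "0.001"))
```

Under the first reading about 200 entries are needed; under the second, two entries are already enough, so the conclusion depends entirely on the unit of the tolerance.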
But it's Friday evening, so my math could be wrong - please correct me if necessary :-)
Your math sounds very convincing for a Friday evening! :-)
Actually, the tolerance is only 0.001 %. We could change it though, I think.
I forgot to say: great that the geopackage uses the same definitions!
Actually I don't think I quite understand your math. Here's my attempt:
Given that we have to round all entries to 2 decimal places in order to be able to also fit "100.00" in a space of 6 characters, the largest rounding error we can get from a single entry is 0.005. Say we have only two entries, and both incur this error, the total deviation can already be 0.01.
| | Original | Rounded | Error |
|---|---|---|---|
| | 80.055 | 80.06 | 0.005 |
| | 19.945 | 19.95 | 0.005 |
| SUM | 100.000 | 100.010 | |
| DEVIATION | 0.0000 | 0.0100 | |
If you add more precision by rounding numbers smaller than 100 to 3 decimal places, the possible error per entry is only 0.0005 as you said, but to reach the total allowed deviation of 0.001 you still only need two such entries, right? To get an error larger than that, I have found an example where 4 entries are sufficient:
| | Original | Rounded | Error |
|---|---|---|---|
| | 50.0555 | 50.056 | 0.0005 |
| | 30.1165 | 30.117 | 0.0005 |
| | 5.0005 | 5.001 | 0.0005 |
| | 14.8275 | 14.828 | 0.0005 |
| SUM | 100.0000 | 100.002 | |
| DEVIATION | 0.0000 | 0.002 | |
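Both examples can be reproduced with a short script. It uses `decimal.Decimal` with half-up rounding so that the half cases behave exactly as in the tables (binary floats cannot represent values like 80.055 exactly, so `round()` on floats may give different results). The function name `rounded_sum_deviation` is only illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP


def rounded_sum_deviation(values, decimals):
    """Return (sum of rounded values) - (sum of original values).

    `values` are decimal strings so that no binary float error creeps in.
    """
    quantum = Decimal(10) ** -decimals
    originals = [Decimal(v) for v in values]
    rounded = [v.quantize(quantum, rounding=ROUND_HALF_UP) for v in originals]
    return sum(rounded) - sum(originals)


two_entries = ["80.055", "19.945"]
four_entries = ["50.0555", "30.1165", "5.0005", "14.8275"]

print(rounded_sum_deviation(two_entries, decimals=2))
print(rounded_sum_deviation(four_entries, decimals=3))
```

The first call gives a deviation of 0.01 and the second a deviation of 0.002, the same as the DEVIATION rows.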
Perhaps this discussion is more academic (and a little fun brain training for me) than practically relevant, though. :-)
Anyway, we got sidetracked a little. I saw in the code that floats are rounded to maximum possible precision depending on the individual value. Adjusting values after rounding depending on the sum of a list of values would probably be quite complicated to implement. I have opened a new issue #9 which we can work on in case this is ever a real problem. The original issue has been fixed, I will create a new release with this fix.
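For reference in the new issue, one possible approach is the largest remainder method: round every value down, then hand out the missing units to the entries with the largest remainders. The sketch below is not taken from the plugin and simplifies the problem by assuming a uniform number of decimal places for all entries (unlike the per-value precision currently used); `round_preserving_total` is a hypothetical name.

```python
from decimal import Decimal, ROUND_FLOOR


def round_preserving_total(values, decimals=2, total=Decimal(100)):
    """Round decimal strings so that the rounded values still sum to `total`.

    Assumes the original values sum to `total` (within one unit per entry).
    """
    quantum = Decimal(10) ** -decimals
    exact = [Decimal(v) for v in values]
    result = [v.quantize(quantum, rounding=ROUND_FLOOR) for v in exact]

    # Number of units of `quantum` still missing after rounding everything down.
    missing = int((total - sum(result)) / quantum)
    if not 0 <= missing <= len(exact):
        raise ValueError("input values do not sum to the requested total")

    # Give one unit each to the entries that lost the most by rounding down.
    remainders = [e - r for e, r in zip(exact, result)]
    order = sorted(range(len(exact)), key=lambda i: remainders[i], reverse=True)
    for i in order[:missing]:
        result[i] += quantum
    return result


print(round_preserving_total(["80.055", "19.945"], decimals=2))
```

Each output value differs from its original by less than one unit in the last decimal place, and the sum is exactly 100. Whether this is worth the added complexity depends on whether the problem ever shows up in practice.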
The data type of the columns "anzsch", "boa1", "boa2", "boa3", "boa4", "boa5" and "boa6" should be integer instead of float. See also https://wiki.bluemodel.org/index.php/BOD-File.
I am not sure where exactly in the plugin code this needs to be changed, is it this line? https://github.com/bluemodel/BlueM.QGISInterface/blob/3571d50d20a2b7f455b40f61ebe720ed2ea4925c/inputfiles_overview.csv#L785