Open mjy opened 4 months ago
Not sure exactly what is meant here. Are we talking extra sig. figs?
@Mesibov can you clarify? Sig figs are likely his concern.
select distinct "coordinateUncertaintyInMeters" from dwc_occurrences where project_id = 1;
Hmm, why is 74m there? One of these things is not like the others.
748.8405857730946
74m
75.4281763295557
750.0
7500.0
751.6423404966713
754.5399279502357
757.9651323351706
76.5570962975804
764.5902389129819
7686.0
775.3439805731413
776.41865961447
776.520386311501
7796.09457165164
7800.0
7817.21069747346
782.0
784.089635815655
7882.24344094419
79.7971421413789
790.2629968469731
800.0
8000.0
8003.34773848392
805.7295944391607
8058.92561065185
809.596219116548
818.8545608754471
820.3306502100987
8227.6571101691
823.546822217106
828.724703149433
831.692683548881
8316.82043332159
836.481716290153
837.132748406671
8388.43747170582
846.539514701933
8500.0
857.1139252044209
863.2552285086282
8684.861133282946
871.2369310259176
8945.321017818254
90.0
900.0
9000.0
918.537006879216
929.516467970883
935.6765476688439
9436.505785229396
96.93616071712428
9649.9406095305
9684.25602192417
969.0
973.779434524857
976.713724947332
985.3808236227832
99.7668002689468
997.376966818365
9972.589691929623
999.0
999.283690376198
9999.0
This is actually a TW issue. Uncertainty creator adds crazy amounts of sig figs.
There is an emerging line of thought that doesn't worry about these sig figs, since they represent the conversion of one unit to another, within the limits of the calculation. For example, these values were likely feet converted to meters. In other words, if we want to back-calculate the value of an individual record in feet, then we want to keep as much precision as possible. When doing calculations across records we'd have to make some sig-fig decisions; however, clearly nobody is recording collections data at this level, so this is a very different issue than in chemistry and physics. Furthermore, when aggregating across collections we're going to have even less certainty, so anyone actually looking at their data is going to conservatively round far beyond these exact values.
This is all to say, we might not change this. :)
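To make the back-calculation argument concrete, here is a minimal Python sketch. It assumes a record originally entered as 100 ft (a made-up value, not one from the data above) and uses hypothetical helper names. It only shows that the unrounded meter value converts back to the original feet, while the value rounded to whole meters does not. It is not how TaxonWorks computes uncertainty.

```python
METERS_PER_FOOT = 0.3048  # exact by definition of the international foot


def feet_to_meters(feet: float) -> float:
    """Convert feet to meters without rounding."""
    return feet * METERS_PER_FOOT


def meters_to_feet(meters: float) -> float:
    """Convert meters back to feet without rounding."""
    return meters / METERS_PER_FOOT


recorded_feet = 100.0  # hypothetical value as originally recorded

stored_full = feet_to_meters(recorded_feet)  # about 30.48 m, kept unrounded
stored_whole = float(round(stored_full))     # 30 m after rounding to whole meters

# The unrounded value recovers the original 100 ft;
# the rounded value comes back as roughly 98.4 ft.
print(meters_to_feet(stored_full))
print(meters_to_feet(stored_whole))
```

Whether that lost foot or two matters depends on whether anyone ever needs the original unit back, which is the open question in this thread.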
https://github.com/SpeciesFileGroup/taxonworks/issues/3946 - Can move discussion there.
Good morning, Matt and Tom. In cUIM, invalid entries are: 100000m, 10000m, 10000M, 1000m, 100m, 10m, 1km, 35000m, 43M, 5000m, ±50m, 50m, 6000m, 74m, -74.98584
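A check along these lines could flag entries like the ones listed above. This is a rough Python sketch, not an existing TaxonWorks or Darwin Core validator; the function name `is_valid_cuim` is hypothetical. It assumes a valid value is a plain unsigned number with no unit suffix, and that zero is also invalid, which is my reading of the Darwin Core term comments.

```python
import re

# Digits with an optional decimal part; no sign, unit suffix, or other characters.
PLAIN_NUMBER = re.compile(r"[0-9]+(\.[0-9]+)?")


def is_valid_cuim(value: str) -> bool:
    """Return True if value looks like a positive number of meters (assumed rule)."""
    text = value.strip()
    if PLAIN_NUMBER.fullmatch(text) is None:
        return False
    return float(text) > 0


# The entries reported as invalid in this thread.
reported_invalid = [
    "100000m", "10000m", "10000M", "1000m", "100m", "10m", "1km", "35000m",
    "43M", "5000m", "±50m", "50m", "6000m", "74m", "-74.98584",
]

flagged = [v for v in reported_invalid if not is_valid_cuim(v)]
print(len(flagged), "of", len(reported_invalid), "flagged")
```

Values such as "748.8405857730946" and "750.0" pass this check, so it separates the format problem (units, signs) from the sig-fig question.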
The sigfig issue is important because in cUIM values like "748.8405857730946" the numbers past the decimal point are pure data noise. cUIM is an estimate based on software or human judgment and neither the software nor the human estimator provides an error (plus or minus N). Rounding to whole meters or even higher (as in "35000") is justified.
I've seen cUIM used to build uncertainty circles in GIS and there is no significant difference so far as this building is concerned between "748.8405857730946" and "749". Back-calculation to feet would be an odd thing to do.
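As a quick sanity check on the GIS point, the Python sketch below compares the area of an uncertainty circle built from "748.8405857730946" with one built from the rounded value. It assumes a simple planar circle, which is a simplification of what a GIS would actually draw, and the helper name is hypothetical.

```python
import math


def circle_area(radius_m: float) -> float:
    """Area in square meters of a planar circle with the given radius."""
    return math.pi * radius_m ** 2


full = 748.8405857730946  # value taken from the query output in this thread
whole = round(full)       # rounds to 749

# Relative change in circle area caused by rounding the radius to whole meters.
relative_area_change = (circle_area(whole) - circle_area(full)) / circle_area(full)
print(whole, f"{relative_area_change:.4%}")
```

The change in area is on the order of a few hundredths of a percent, which supports the claim that the two values are interchangeable for drawing uncertainty circles.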
Note also that the method of estimating cUIM has not been documented for each record. I'm not saying it should be, just pointing out that in the soup of cUIMs estimated by various means and by various people and programs, the practical basis for comparison is whole meters, and that's what Darwin Core expects: "The horizontal distance (in meters) from the given dwc:decimalLatitude and dwc:decimalLongitude describing the smallest circle containing the whole of the dcterms:Location." (DwC)