SUMup-database / SUMup-data-suggestion

The issue tab of this repository is used to discuss new and existing data that could be added to SUMup.

duplicate Greenland pit data #75

Open jasonebox opened 9 months ago

jasonebox commented 9 months ago

The good news here is that the correct data appear in SUMup. However, several sites also have duplicate values that lack a date. The duplicate/bad site data use the correct, longer version of the name, e.g. Crawford Point, while the correct data use the site name abbreviation, in this case CP1.

The best-practice fix for this issue, detailed in several examples below, would seem to be to use the full name, e.g. South Dome instead of SDM, along with the more correct data (those that include the measurement date, i.e., the "end_date").

first example:

bad news: measurement_ids 304530 and 304531 have identical values for site name Crawford Point. Another error is that, for these measurement_ids, start_year = end_year

good news: ... the measurement_id 304517 for name: CP1 has the correct smb value and the correct start_year and end_year

the fix, as I see it, is to remove measurement_ids 304530 and 304531
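A check like the one described above could be sketched as follows. This is only an illustration, not how SUMup is actually processed: the field names (`measurement_id`, `name`, `smb`, `start_year`, `end_year`) follow the wording of this thread, while the smb and year values and the helper `find_suspect_duplicates` are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical rows: the measurement_ids and names come from this thread,
# but the smb and year values are made-up placeholders.
rows = [
    {"measurement_id": 304517, "name": "CP1", "smb": 0.41,
     "start_year": 2007, "end_year": 2008},
    {"measurement_id": 304530, "name": "Crawford Point", "smb": 0.41,
     "start_year": 2008, "end_year": 2008},
    {"measurement_id": 304531, "name": "Crawford Point", "smb": 0.41,
     "start_year": 2008, "end_year": 2008},
]


def find_suspect_duplicates(rows):
    """Return ids of rows that share (name, smb) with another row
    and also have start_year == end_year."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["name"], row["smb"])].append(row)

    suspects = []
    for members in groups.values():
        if len(members) > 1:  # same site name and same smb more than once
            suspects.extend(
                r["measurement_id"]
                for r in members
                if r["start_year"] == r["end_year"]
            )
    return sorted(suspects)


suspect_ids = find_suspect_duplicates(rows)
print(suspect_ids)
```

With these placeholder rows the two Crawford Point entries are flagged and the CP1 entry is left alone. Flagged ids would still need a manual look before removal.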

jasonebox commented 9 months ago

similarly for name: DYE-2, there are duplicates where the good values have the name DY2. Therefore, the fix seems to be to simply remove measurement_ids: 304539 through 304544

image
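Removing an inclusive range of measurement_ids, as proposed for DYE-2, might look like the sketch below. The row layout and the helper `drop_id_range` are assumptions for illustration, and the id 999999 for the DY2 row is a placeholder, not a real SUMup id.

```python
# Hypothetical rows: ids 304539-304544 are the DYE-2 duplicates named in
# this thread; 999999 is a placeholder id standing in for a good DY2 row.
rows = [{"measurement_id": i, "name": "DYE-2"} for i in range(304539, 304545)]
rows.append({"measurement_id": 999999, "name": "DY2"})


def drop_id_range(rows, first_id, last_id):
    """Return the rows whose measurement_id is outside [first_id, last_id]."""
    return [
        r for r in rows
        if not (first_id <= r["measurement_id"] <= last_id)
    ]


kept = drop_id_range(rows, 304539, 304544)
print([r["name"] for r in kept])
```

Both ends of the range are included, matching "304539 through 304544".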

jasonebox commented 9 months ago

another redundancy is measurement_id 304516 for CP1, which has empty start and end dates and an smb identical to that of measurement_id 304515

the fix appears to be to remove measurement_id: 304516

image
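The undated-duplicate case above could be flagged with a small check like this one. It is a sketch under assumptions: `end_date` is named in this thread, but `start_date`, the date and smb values, and the helper `undated_duplicates` are hypothetical.

```python
# Hypothetical rows: the ids and site name come from this thread, the smb
# and date values are placeholders. None stands for an empty date field.
rows = [
    {"measurement_id": 304515, "name": "CP1", "smb": 0.35,
     "start_date": "2006-05-01", "end_date": "2007-05-01"},
    {"measurement_id": 304516, "name": "CP1", "smb": 0.35,
     "start_date": None, "end_date": None},
]


def undated_duplicates(rows):
    """Return ids of rows with no dates whose (name, smb) matches a dated row."""
    dated = {
        (r["name"], r["smb"])
        for r in rows
        if r["start_date"] and r["end_date"]
    }
    return [
        r["measurement_id"]
        for r in rows
        if not (r["start_date"] or r["end_date"])
        and (r["name"], r["smb"]) in dated
    ]


redundant_ids = undated_duplicates(rows)
print(redundant_ids)
```

Only rows that have a dated twin are returned, so an undated measurement with no counterpart would be kept for review rather than dropped.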

jasonebox commented 9 months ago

similar issue, redundancy and end_date issue for South Dome, with correct data under the name SDM

image

here are the duplicate/erroneous South Dome data

image

the best-practice fix for this general issue would seem to be to use the full name, e.g. South Dome instead of SDM, along with the more correct data (those that include the measurement date, the end_date)

jasonebox commented 9 months ago

For Saddle, I find no SDL data (good news, I think), but there are date and duplication issues for the years 2008, 2010, and 2013:

image

BaptisteVandecrux commented 6 months ago

I identified the origin of the duplication and removed it from the next release. Thanks for the heads up!