ices-eg / WGJCDP-Cetaceans

Joint Cetaceans Data Portal (development)
Creative Commons Zero v1.0 Universal

data submitters should be invited to test the download function; so they understand the process, and also get some trust in the way it works #267

Closed neil-ices-dk closed 2 years ago

neil-ices-dk commented 2 years ago

data submitters should be invited to test the download function; so they understand the process, and also get some trust in the way it works @NikkiTaylorJNCC

Originally posted by @neil-ices-dk in https://github.com/ices-tools-dev/cetaceans/issues/248#issuecomment-1149825781

neil-ices-dk commented 2 years ago

@cmspinto to clear down the download list so we are at ground zero

NikiClear commented 2 years ago

@neil-ices-dk @cmspinto @pcrjoana @NikkiTaylorJNCC

Reported issue from QC form:

Duplication of records in the effort and sightings tables in the download (this isn't in the original SCANS uploaded data)

Quality Control Check Proposed: The duplication appears to occur because the download contained 14 different identifiers, so each effort record and each sighting record was extracted 14 times (i.e. once for each identifier). The filters I used were harbour porpoise, data from surface vessels, and the Greater North Sea and Celtic Sea. The data I got were all SCANS 2005 data, but because 7 ships were used and each ship used 2 methodologies, there were 14 identifiers. I think the relationship should be one-to-many to extract the data properly, but instead it is many-to-many, which results in duplicated data.

Applicable to: Identifiers record, Sightings record, Effort and Environment record

Additional comments: I tried downloading some data and found that both effort and sightings records were duplicated 14 times, i.e. each effort segment was recorded 14 times and each sighting was recorded 14 times.
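
For anyone reproducing this check, the duplication described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the field names (`effort_id`, `identifier`), the `ship…-method…` labels, and the helper `duplication_factors` are hypothetical and are not taken from the portal's schema or download format.

```python
from collections import Counter


def duplication_factors(rows, key_fields):
    """Count how many times each record (identified by key_fields) appears."""
    return Counter(tuple(row[field] for field in key_fields) for row in rows)


# Hypothetical identifiers: 7 ships x 2 methodologies = 14 identifiers.
identifiers = [f"ship{s}-method{m}" for s in range(1, 8) for m in (1, 2)]

# Hypothetical effort table: one effort segment per identifier.
effort = [
    {"effort_id": i, "identifier": ident}
    for i, ident in enumerate(identifiers, start=1)
]

# A many-to-many extraction pairs every effort row with every identifier,
# which is the pattern reported in the QC form.
bad_download = [
    dict(row, joined_identifier=ident) for row in effort for ident in identifiers
]

counts = duplication_factors(bad_download, ["effort_id"])
max_factor = max(counts.values())
print(f"{len(bad_download)} rows; each effort segment appears {max_factor} times")
```

A download free of this problem would give a factor of 1 for every `effort_id`; the same check could be applied to the sightings table with its own key field.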

cmspinto commented 2 years ago

Would you have a Download ID I can use to extract the data and see the duplication?

NikiClear commented 2 years ago

> Would you have a Download ID I can use to extract the data and see the duplication?

https://cetaceans.ices.dk/Download/121c8af9-5914-419b-b843-d6e04a95d09c

NikkiTaylorJNCC commented 2 years ago

> data submitters should be invited to test the download function; so they understand the process, and also get some trust in the way it works @NikkiTaylorJNCC
>
> Originally posted by @neil-ices-dk in #248 (comment)

ORCA have been invited to test the download function and will test soon, feedback to come via the form.

cmspinto commented 2 years ago

Thanks for the heads-up. This error was caused by our change in the format: the download was linking records by the file instead of by the identifier. It should be working fine now, and it can be tested with the public data: https://cetaceans.ices.dk/Download/121c8af9-5914-419b-b843-d6e04a95d09c
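
To illustrate the difference between linking by file and linking by identifier, here is a minimal sketch using an in-memory SQLite database. The table and column names (`identifiers`, `effort`, `identifier_id`, `file_id`) are assumptions made for this example only; they are not the portal's actual schema or queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE identifiers (identifier_id INTEGER PRIMARY KEY, file_id INTEGER);
    CREATE TABLE effort (
        effort_id INTEGER PRIMARY KEY,
        identifier_id INTEGER,
        file_id INTEGER
    );
    """
)

# Hypothetical upload: one file (file_id = 1) containing 14 identifiers,
# with one effort segment per identifier.
conn.executemany(
    "INSERT INTO identifiers VALUES (?, ?)", [(i, 1) for i in range(1, 15)]
)
conn.executemany(
    "INSERT INTO effort VALUES (?, ?, ?)", [(i, i, 1) for i in range(1, 15)]
)

# Linking by file: every effort row matches all 14 identifiers in the file.
by_file = conn.execute(
    "SELECT COUNT(*) FROM effort e JOIN identifiers i ON e.file_id = i.file_id"
).fetchone()[0]

# Linking by identifier: every effort row matches exactly one identifier.
by_identifier = conn.execute(
    "SELECT COUNT(*) FROM effort e "
    "JOIN identifiers i ON e.identifier_id = i.identifier_id"
).fetchone()[0]

print(by_file, by_identifier)
conn.close()
```

In this sketch the join on `file_id` returns 14 rows per effort segment, while the join on `identifier_id` returns one row per segment, which matches the one-to-many relationship suggested in the QC report.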

neil-ices-dk commented 2 years ago

@NikiClear to check that the fix is working please :)

NikiClear commented 2 years ago

> @NikiClear to check that the fix is working please :)

Tested, and it is working now.