AtlasOfLivingAustralia / biocache-service

Occurrence & mapping webservices
https://biocache-ws.ala.org.au/ws/

Some downloads are silently being dropped #346

Closed ansell closed 2 years ago

ansell commented 5 years ago

This morning I ran 4 downloads of the same record, each with a different download format (ALA Legacy, Full Darwin Core, Custom Minimal, Custom Full), and only two of them generated DOIs or emails. If the other two had failed with errors, they should still have generated emails, but they didn't.

The two that did complete, and which also appear in the list of downloads on doi.ala.org.au, are:

https://doi.org/10.26197/5bede521ba9e1 (ALA Legacy Format)

https://doi.org/10.26197/5bede576371d7 (Custom Full)

The nginx log on prod-bdown-b5 confirms that biocache-service accepted all four requests with HTTP 200 responses, but the biocache-service log file contains no reference to the two missing downloads at all. The second and third downloads (Full Darwin Core and Custom Minimal) were sent while the first download (ALA Legacy Format) was still executing; the fourth (Custom Full) was triggered 7 seconds after the email for the first download was sent.

Is it possible that a hash key is being used to eliminate duplicate downloads, and that it treats these requests as duplicates even though their fields differ?
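
To make the question concrete, here is a minimal Java sketch of how such a collision could happen. It is purely illustrative: the class, the method names, and the choice of `q` and `email` as key inputs are assumptions, not taken from the biocache-service code. It relies on `UUID.nameUUIDFromBytes`, which produces a deterministic (version 3) UUID from its input, so any parameter left out of that input cannot distinguish two requests.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class DedupKeySketch {

    // Hypothetical key that ignores the requested fields.
    static UUID keyIgnoringFields(String q, String email) {
        return UUID.nameUUIDFromBytes((q + "|" + email).getBytes(StandardCharsets.UTF_8));
    }

    // Hypothetical key that also covers the requested fields.
    static UUID keyIncludingFields(String q, String email, String fields) {
        return UUID.nameUUIDFromBytes(
                (q + "|" + email + "|" + fields).getBytes(StandardCharsets.UTF_8));
    }

    // Adds a download unless one with the same key is already queued.
    // Returns true if it was queued, false if it was silently skipped.
    static boolean enqueue(Map<UUID, String> queue, UUID key, String file) {
        return queue.putIfAbsent(key, file) == null;
    }

    public static void main(String[] args) {
        String q = "occurrence_id:\"example\""; // placeholder query
        String email = "user@example.org";      // placeholder address
        Map<UUID, String> queue = new HashMap<>();

        boolean first = enqueue(queue, keyIgnoringFields(q, email), "legacy.csv");
        boolean second = enqueue(queue, keyIgnoringFields(q, email), "full-dwc.csv");
        System.out.println("first queued: " + first + ", second queued: " + second);
    }
}
```

With a key like `keyIgnoringFields`, the second request is skipped for as long as the first one is still in the queue, and no error is raised. Switching to something like `keyIncludingFields` would keep the two requests apart. Whether biocache-service does anything of this kind is exactly what needs checking.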

2018-11-16 08:29:59,086 [http-bio-8080-exec-13] DEBUG au.org.ala.biocache.util.QueryFormatUtils  (QueryFormatUtils.java:544) - escaping lsid urns  urn:catalog:NSW
2018-11-16 08:30:16,023 [biocachedownload-pool-500000-RECORDS_INDEX-0] DEBUG au.org.ala.biocache.service.EmailService  (EmailService.java:64) - Send email to : REDACTED@REDACTED
2018-11-16 08:30:16,080 [biocachedownload-pool-500000-RECORDS_INDEX-0] DEBUG au.org.ala.biocache.dao.JsonPersistentQueueDAOImpl  (JsonPersistentQueueDAOImpl.java:231) - Removing the download from the queue
2018-11-16 08:30:16,080 [biocachedownload-pool-500000-RECORDS_INDEX-0] INFO au.org.ala.biocache.dao.JsonPersistentQueueDAOImpl  (JsonPersistentQueueDAOImpl.java:236) - Deleting /data/cache/downloads/offline1542317337205.json true
2018-11-16 08:30:23,134 [http-bio-8080-exec-4] INFO au.org.ala.biocache.service.AuthService  (AuthService.java:189) - authCache requesting: https://auth.ala.org.au/userdetails/userDetails/getUserDetails?userName=REDACTED
52.64.45.52 - - [16/Nov/2018:08:28:57 +1100] "GET /ws/occurrences/offline/download?hubName=Atlas+of+Living+Australia&file=records-2018-11-16-occurrenceid-ala-legacy-format-csv&mintDoi=true&reasonTypeId=10&searchUrl=https%3A%2F%2Fbiocache.ala.org.au%2Foccurrences%2Fsearch%3Fq%3Doccurrence_id%253A%2522urn%253Acatalog%253ANSW%2520Office%2520of%2520Environment%2520and%2520Heritage%253ABioNet%2520Atlas%2520of%2520NSW%2520Wildlife%253ANSW38060%2522&fileType=csv&qa=none&sourceTypeId=0&email=REDACTED%40REDACTED&doiDisplayUrl=https%3A%2F%2Fbiocache.ala.org.au%2Fdownload%2Fdoi%3Fdoi%3D&q=occurrence_id%3A%22urn%3Acatalog%3ANSW%20Office%20of%20Environment%20and%20Heritage%3ABioNet%20Atlas%20of%20NSW%20Wildlife%3ANSW38060%22 HTTP/1.1" 200 372 "-" "-" "52.64.45.52" request_time=0.368 upstream_response_time=0.368 upstream_connect_time=0.000 upstream_header_time=0.368 upstream_cache_status=-
REDACTED - - [16/Nov/2018:08:28:58 +1100] "GET /ws/occurrences/offline/status/ad425ee9-ad26-309f-80ec-bf0a27da52de-1542317337205 HTTP/1.1" 200 188 "https://biocache.ala.org.au/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:63.0) Gecko/20100101 Firefox/63.0" "REDACTED" request_time=0.001 upstream_response_time=0.001 upstream_connect_time=0.000 upstream_header_time=0.001 upstream_cache_status=-
52.64.45.52 - - [16/Nov/2018:08:29:24 +1100] "GET /ws/occurrences/offline/download?hubName=Atlas+of+Living+Australia&file=records-2018-11-16-occurrenceid-full-darwin-core-csv&mintDoi=true&reasonTypeId=10&dwcHeaders=true&searchUrl=https%3A%2F%2Fbiocache.ala.org.au%2Foccurrences%2Fsearch%3Fq%3Doccurrence_id%253A%2522urn%253Acatalog%253ANSW%2520Office%2520of%2520Environment%2520and%2520Heritage%253ABioNet%2520Atlas%2520of%2520NSW%2520Wildlife%253ANSW38060%2522&fields=rowkey%2Cmodified_date%2Clanguage%2Clicense%2Crightsholder%2Cbibliographic_citation%2Cinstitution_id%2Ccollection_id%2Cdataset_id%2Cinstitution_code%2Ccollection_code%2Cdataset_name%2Cowner_institution_code%2Cbasis_of_record%2Cdynamic_properties%2Coccurrence_id%2Ccatalogue_number%2Crecord_number%2Ccollector%2Cindividual_count%2Craw_sex%2Clife_stage%2Creproductive_condition%2Cbehavior%2Ccultivated%2Cestablishment_means%2Coccurrence_status%2Cpreparations%2Cdisposition%2Cassociated_media%2Cassociated_references%2Cassociated_sequences%2Cassociated_taxa%2Cother_catalog_numbers%2Coccurrence_remarks%2Cprevious_identifications%2Cevent_id%2Cfield_number%2Coccurrence_date%2Cevent_time%2Cstart_day_of_year%2Cend_day_of_year%2Cyear%2Cmonth%2Cday%2Cverbatim_event_date%2Chabitat%2Csampling_protocol%2Csampling_effort%2Cfield_notes%2Cevent_remarks%2Clocation_id%2Chigher_geography%2Ccontinent%2Cwater_body%2Cisland_group%2Cisland%2Ccountry%2Ccountry_code%2Cstate%2Ccounty%2Cmunicipality%2Craw_locality%2Cverbatim_locality%2Cmin_elevation_d%2Cmax_elevation_d%2Cmax_depth_d%2Cmin_depth_d%2Clocation_according_to%2Clocation_remarks%2Clatitude%2Clongitude%2Ccoordinate_uncertainty%2Ccoordinate_precision%2Cverbatim_coordinates%2Craw_latitude%2Cverbatim_latitude%2Craw_longitude%2Cverbatim_longitude%2Craw_datum%2Cverbatim_coordinate_system%2Cverbatim_srs%2Cfootprint_wkt%2Cfootprint_srs%2Cgeoreferenced_by%2Cgeoreferenced_date%2Cgeoreference_protocol%2Cgeoreference_sources%2Cgeoreference_verification_status%2Cgeoreference_remarks%2Cidentification_id%2Cidentification_qualifier%2Ctype_status%2Cidentified_by%2Cidentified_date%2Cidentification_references%2Cidentification_verification_status%2Cidentification_remarks%2Ctaxon_id%2Cscientific_name_id%2Ctaxon_concept_lsid%2Craw_taxon_name%2Ctaxon_name%2Caccepted_name_usage%2Cparent_name_usage%2Coriginal_name_usage%2Cname_published_in%2Chigher_classification%2Ckingdom%2Cphylum%2Cclass%2Corder%2Cfamily%2Cgenus%2Csubgenus%2Cspecific_epithet%2Cinfraspecific_epithet%2Crank%2Cscientific_name_authorship%2Ccommon_name%2Cnomenclatural_code%2Ctaxonomic_status%2Cnomenclatural_status%2Ctaxon_remarks%2Cindividual_id%2Cscientific_name_addendum%2Cidentifier_role%2Cspecies%2Cmeasurement_id%2Craw_identification_qualifier%2Cprovenance%2Cmeasurement_determined_by%2Cmeasurement_determined_date%2Cid%2Crights%2Cmeasurement_value%2Craw_continent%2Ctype_status_qualifier%2Crelated_resource_id%2Cmeasurement_accuracy%2Craw_basis_of_record%2Crelationship_of_resource%2Cmeasurement_type%2Cmeasurement_unit%2Ctype&fileType=csv&qa=none&sourceTypeId=0&email=REDACTED%40REDACTED&doiDisplayUrl=https%3A%2F%2Fbiocache.ala.org.au%2Fdownload%2Fdoi%3Fdoi%3D&q=occurrence_id%3A%22urn%3Acatalog%3ANSW%20Office%20of%20Environment%20and%20Heritage%3ABioNet%20Atlas%20of%20NSW%20Wildlife%3ANSW38060%22 HTTP/1.1" 200 402 "-" "-" "52.64.45.52" request_time=0.026 upstream_response_time=0.026 upstream_connect_time=0.000 upstream_header_time=0.026 upstream_cache_status=-
REDACTED - - [16/Nov/2018:08:29:25 +1100] "GET /ws/occurrences/offline/status/ad425ee9-ad26-309f-80ec-bf0a27da52de-1542317364548 HTTP/1.1" 200 50 "https://biocache.ala.org.au/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:63.0) Gecko/20100101 Firefox/63.0" "REDACTED" request_time=0.002 upstream_response_time=0.002 upstream_connect_time=0.000 upstream_header_time=0.002 upstream_cache_status=-
52.64.45.52 - - [16/Nov/2018:08:29:59 +1100] "GET /ws/occurrences/offline/download?hubName=Atlas+of+Living+Australia&file=records-2018-11-16-occurrenceid-custom-minimal-csv&mintDoi=true&reasonTypeId=10&dwcHeaders=true&searchUrl=https%3A%2F%2Fbiocache.ala.org.au%2Foccurrences%2Fsearch%3Fq%3Doccurrence_id%253A%2522urn%253Acatalog%253ANSW%2520Office%2520of%2520Environment%2520and%2520Heritage%253ABioNet%2520Atlas%2520of%2520NSW%2520Wildlife%253ANSW38060%2522&fields=rowkey%2Clanguage%2Clicense%2Crightsholder%2Cbibliographic_citation%2Cinstitution_id%2Ccollection_id%2Cdataset_id%2Cinstitution_code%2Ccollection_code%2Cdataset_name%2Cowner_institution_code%2Cbasis_of_record%2Cdynamic_properties%2Cprovenance%2Crights%2Ctype%2Coccurrence_id%2Ccatalogue_number%2Crecord_number%2Ccollector%2Cindividual_count%2Craw_sex%2Clife_stage%2Creproductive_condition%2Cbehavior%2Ccultivated%2Cestablishment_means%2Coccurrence_status%2Cpreparations%2Cdisposition%2Cassociated_media%2Cassociated_references%2Cassociated_sequences%2Cassociated_taxa%2Cother_catalog_numbers%2Coccurrence_remarks%2Cindividual_id%2Cid&fileType=csv&qa=none&sourceTypeId=0&email=REDACTED%40REDACTED&doiDisplayUrl=https%3A%2F%2Fbiocache.ala.org.au%2Fdownload%2Fdoi%3Fdoi%3D&q=occurrence_id%3A%22urn%3Acatalog%3ANSW%20Office%20of%20Environment%20and%20Heritage%3ABioNet%20Atlas%20of%20NSW%20Wildlife%3ANSW38060%22 HTTP/1.1" 200 402 "-" "-" "52.64.45.52" request_time=0.207 upstream_response_time=0.207 upstream_connect_time=0.000 upstream_header_time=0.207 upstream_cache_status=-
REDACTED - - [16/Nov/2018:08:29:59 +1100] "GET /ws/occurrences/offline/status/ad425ee9-ad26-309f-80ec-bf0a27da52de-1542317399086 HTTP/1.1" 200 50 "https://biocache.ala.org.au/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:63.0) Gecko/20100101 Firefox/63.0" "REDACTED" request_time=0.001 upstream_response_time=0.001 upstream_connect_time=0.000 upstream_header_time=0.001 upstream_cache_status=-
52.64.45.52 - - [16/Nov/2018:08:30:23 +1100] "GET /ws/occurrences/offline/download?hubName=Atlas+of+Living+Australia&file=records-2018-11-16-occurrenceid-custom-full-csv&mintDoi=true&reasonTypeId=10&dwcHeaders=true&includeMisc=true&searchUrl=https%3A%2F%2Fbiocache.ala.org.au%2Foccurrences%2Fsearch%3Fq%3Doccurrence_id%253A%2522urn%253Acatalog%253ANSW%2520Office%2520of%2520Environment%2520and%2520Heritage%253ABioNet%2520Atlas%2520of%2520NSW%2520Wildlife%253ANSW38060%2522&fields=rowkey%2Clanguage%2Clicense%2Crightsholder%2Cbibliographic_citation%2Cinstitution_id%2Ccollection_id%2Cdataset_id%2Cinstitution_code%2Ccollection_code%2Cdataset_name%2Cowner_institution_code%2Cbasis_of_record%2Cdynamic_properties%2Cprovenance%2Crights%2Ctype%2Coccurrence_id%2Ccatalogue_number%2Crecord_number%2Ccollector%2Cindividual_count%2Craw_sex%2Clife_stage%2Creproductive_condition%2Cbehavior%2Ccultivated%2Cestablishment_means%2Coccurrence_status%2Cpreparations%2Cdisposition%2Cassociated_media%2Cassociated_references%2Cassociated_sequences%2Cassociated_taxa%2Cother_catalog_numbers%2Coccurrence_remarks%2Cindividual_id%2Cid%2Cprevious_identifications%2Cevent_id%2Cfield_number%2Coccurrence_date%2Cevent_time%2Cstart_day_of_year%2Cend_day_of_year%2Cyear%2Cmonth%2Cday%2Cverbatim_event_date%2Chabitat%2Csampling_protocol%2Csampling_effort%2Cfield_notes%2Cevent_remarks%2Clocation_id%2Chigher_geography%2Ccontinent%2Cwater_body%2Cisland_group%2Cisland%2Ccountry%2Ccountry_code%2Cstate%2Ccounty%2Cmunicipality%2Craw_locality%2Cverbatim_locality%2Cmin_elevation_d%2Cmax_elevation_d%2Cmax_depth_d%2Cmin_depth_d%2Clocation_according_to%2Clocation_remarks%2Clatitude%2Clongitude%2Ccoordinate_uncertainty%2Ccoordinate_precision%2Cverbatim_coordinates%2Craw_latitude%2Cverbatim_latitude%2Craw_longitude%2Cverbatim_longitude%2Craw_datum%2Cverbatim_coordinate_system%2Cverbatim_srs%2Cfootprint_wkt%2Cfootprint_srs%2Cgeoreferenced_by%2Cgeoreferenced_date%2Cgeoreference_protocol%2Cgeoreference_sources%2Cgeoreference_verification_status%2Cgeoreference_remarks%2Cidentification_id%2Cidentification_qualifier%2Ctype_status%2Cidentified_by%2Cidentified_date%2Cidentification_references%2Cidentification_verification_status%2Cidentification_remarks%2Cidentifier_role%2Ctype_status_qualifier%2Ctaxon_id%2Cscientific_name_id%2Ctaxon_concept_lsid%2Craw_taxon_name%2Ctaxon_name%2Caccepted_name_usage%2Cparent_name_usage%2Coriginal_name_usage%2Cname_published_in%2Chigher_classification%2Ckingdom%2Cphylum%2Cclass%2Corder%2Cfamily%2Cgenus%2Csubgenus%2Cspecific_epithet%2Cinfraspecific_epithet%2Crank%2Cscientific_name_authorship%2Ccommon_name%2Cnomenclatural_code%2Ctaxonomic_status%2Cnomenclatural_status%2Ctaxon_remarks%2Cscientific_name_addendum%2Cspecies%2Cmeasurement_id%2Cmeasurement_determined_by%2Cmeasurement_determined_date%2Cmeasurement_value%2Cmeasurement_accuracy%2Cmeasurement_type%2Cmeasurement_unit%2Caust_conservation%2Cstate_conservation%2Cspecies_group%2Cspecies_subgroup%2Cel_p%2Ccl_p&fileType=csv&qa=includeall&sourceTypeId=0&email=REDACTED%40REDACTED&doiDisplayUrl=https%3A%2F%2Fbiocache.ala.org.au%2Fdownload%2Fdoi%3Fdoi%3D&q=occurrence_id%3A%22urn%3Acatalog%3ANSW%20Office%20of%20Environment%20and%20Heritage%3ABioNet%20Atlas%20of%20NSW%20Wildlife%3ANSW38060%22 HTTP/1.1" 200 372 "-" "-" "52.64.45.52" request_time=0.023 upstream_response_time=0.023 upstream_connect_time=0.000 upstream_header_time=0.023 upstream_cache_status=-
REDACTED - - [16/Nov/2018:08:30:23 +1100] "GET /ws/occurrences/offline/status/ad425ee9-ad26-309f-80ec-bf0a27da52de-1542317423147 HTTP/1.1" 200 188 "https://biocache.ala.org.au/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:63.0) Gecko/20100101 Firefox/63.0" "REDACTED" request_time=0.002 upstream_response_time=0.002 upstream_connect_time=0.000 upstream_header_time=0.002 upstream_cache_status=-
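
The status IDs in the nginx log above can be decoded to cross-check the timeline. All four share the same UUID prefix and differ only in the numeric suffix. The sketch below splits each ID into those two parts. Reading the suffix as epoch milliseconds is an inference (it lines up with the request times and with the `offline1542317337205.json` file name in the service log), and the class and method names are hypothetical. What input the service hashes to produce the UUID is not known from the logs.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.util.UUID;

public class StatusIdInspector {

    // Everything before the last '-' is treated as the UUID part.
    static UUID uuidPart(String statusId) {
        return UUID.fromString(statusId.substring(0, statusId.lastIndexOf('-')));
    }

    // Everything after the last '-' is assumed to be epoch milliseconds.
    static long millisPart(String statusId) {
        return Long.parseLong(statusId.substring(statusId.lastIndexOf('-') + 1));
    }

    // Converts the suffix to local time at +11:00, the offset used in the nginx log.
    static OffsetDateTime requestTime(String statusId) {
        return Instant.ofEpochMilli(millisPart(statusId)).atOffset(ZoneOffset.ofHours(11));
    }

    public static void main(String[] args) {
        String[] ids = {
            "ad425ee9-ad26-309f-80ec-bf0a27da52de-1542317337205",
            "ad425ee9-ad26-309f-80ec-bf0a27da52de-1542317364548",
            "ad425ee9-ad26-309f-80ec-bf0a27da52de-1542317399086",
            "ad425ee9-ad26-309f-80ec-bf0a27da52de-1542317423147"
        };
        for (String id : ids) {
            UUID uuid = uuidPart(id);
            System.out.println(uuid + " version=" + uuid.version() + " time=" + requestTime(id));
        }
    }
}
```

The UUID part reports version 3, the name-based variety, which is deterministic for a given input. So whatever the UUID is derived from was identical across all four requests. That is compatible with the duplicate-key question, but it does not prove it, since the timestamp suffix still makes each full ID unique.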
djtfmartin commented 2 years ago

Closing historic issue. If this is still a problem, please reopen.