open-contracting / kingfisher-collect

Downloads OCDS data and stores it on disk
https://kingfisher-collect.readthedocs.io
BSD 3-Clause "New" or "Revised" License

moldova: Inherit from BaseSpider and re-implement parse method #1056

Closed: sentry-io[bot] closed this issue 5 months ago

sentry-io[bot] commented 7 months ago

This logic:

        data = response.json()

        if data.get('name') == 'Error':
            data['http_code'] = response.status
            yield self.build_file_error_from_response(response, errors=data)
            return

This check needs to be in parse as well, not only in parse_list.

Its absence from parse caused 35k errors in Sentry.
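
One way to share the check between both callbacks is sketched below. It is a minimal illustration under assumptions, not the project's actual spider: `FakeResponse`, `SketchSpider`, `check_error`, and `build_file_from_response` as written here are hypothetical stand-ins, and only the `name == 'Error'` test, the `http_code` field, and the `build_file_error_from_response` name come from the snippet above.

```python
import json
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Hypothetical stand-in for a Scrapy response (only what the sketch needs)."""
    status: int
    text: str

    def json(self):
        return json.loads(self.text)


def check_error(response):
    """Decode the body and report whether it is an API error object."""
    data = response.json()
    is_error = isinstance(data, dict) and data.get('name') == 'Error'
    if is_error:
        # Record the HTTP status alongside the error, as in the issue's snippet.
        data['http_code'] = response.status
    return data, is_error


class SketchSpider:
    """Hypothetical spider used only to show the shared check."""

    def build_file_error_from_response(self, response, errors):
        # Stand-in for the method named in the issue; the real one differs.
        return {'errors': errors}

    def build_file_from_response(self, response, data):
        # Hypothetical stand-in for building a normal file item.
        return {'data': data}

    def parse_list(self, response):
        data, is_error = check_error(response)
        if is_error:
            yield self.build_file_error_from_response(response, errors=data)
            return
        # Placeholder: a real spider would schedule requests from the list here.
        yield {'list': data}

    def parse(self, response):
        # Same guard as parse_list, so error bodies never reach the pipeline
        # as if they were OCDS data.
        data, is_error = check_error(response)
        if is_error:
            yield self.build_file_error_from_response(response, errors=data)
            return
        yield self.build_file_from_response(response, data)
```

With the guard in both callbacks, an error body is reported as a file error instead of being stored as a file that later fails format detection.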


Sentry Issue: REGISTRY-KINGFISHER-PROCESS-9C

UnknownFormatError: top-level JSON value is a non-OCDS object
  File "process/management/commands/file_worker.py", line 81, in callback
    upgraded_collection_file_id = process_file(collection_file)
  File "process/management/commands/file_worker.py", line 122, in process_file
    data_type = _get_data_type(collection_file)
  File "process/management/commands/file_worker.py", line 157, in _get_data_type
    detected_format, is_concatenated, is_array = detect_format(collection_file.filename)

Source moldova yields an unknown or unsupported format, skipping

jpmckinney commented 7 months ago

Also, we should maybe retry these requests, since 35k missing contracting processes is a lot.
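
A bounded retry could look roughly like the sketch below. It is framework-free and rests on assumptions: `MAX_RETRIES`, the `retries` meta key, and the `next_attempt` and `handle` helpers are all hypothetical, and the limit of 3 is arbitrary. Only the `name == 'Error'` test and the `http_code` field come from the issue.

```python
import json

MAX_RETRIES = 3  # assumed limit; not taken from the project


def next_attempt(meta, max_retries=MAX_RETRIES):
    """Return the meta for a retried request, or None once retries run out."""
    retries = meta.get('retries', 0) + 1
    if retries > max_retries:
        return None
    return {**meta, 'retries': retries}


def handle(status, body, meta):
    """Decide what to do with a response: retry it, report an error, or keep it."""
    data = json.loads(body)
    if isinstance(data, dict) and data.get('name') == 'Error':
        retry_meta = next_attempt(meta)
        if retry_meta is not None:
            # The caller would re-issue the same request with this meta.
            return ('retry', retry_meta)
        # Retries exhausted: fall back to reporting the error.
        data['http_code'] = status
        return ('error', data)
    return ('file', data)
```

In a Scrapy spider, the 'retry' branch would correspond to yielding a copy of the original request with the updated meta (and duplicate filtering disabled for it), so that a transient API error does not permanently drop a contracting process.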