comic / grand-challenge.org

A platform for end-to-end development of machine learning solutions in biomedical imaging
https://grand-challenge.org
Apache License 2.0

Wrong output creation for json type outputs not stored in db #2702

Closed amickan closed 1 year ago

amickan commented 1 year ago

Ward Hendrix reported that his algorithm (Lung nodule detection in routine clinical CT), which uses the nodule-locations interface, keeps failing with the following error: "The output file nodule-locations.json is not valid". This is not a JSON decoding issue; the JSON he is outputting has the correct format. It is a ValidationError raised on CIV creation.

I traced back the code, and I think the issue arises here:

def get_outputs(self, *, output_interfaces):
        outputs = []
        with transaction.atomic():
            # Atomic block required as create_instance needs to
            # create interfaces in order to store the files
            for interface in output_interfaces:
                if interface.is_image_kind:
                    res = self._create_images_result(interface=interface)
                elif interface.is_json_kind:
                    res = self._create_json_result(interface=interface)
                else:
                    res = self._create_file_result(interface=interface)

                outputs.append(res)

        return outputs

For outputs of json type that are not stored in the database, we should create a file result, but instead we're trying to create a json result, which will fail because we try to set a value instead of a file.

So I think the solution is to check if interface.requires_file and to move that check before the is_json_kind check:

def get_outputs(self, *, output_interfaces):
        outputs = []
        with transaction.atomic():
            # Atomic block required as create_instance needs to
            # create interfaces in order to store the files
            for interface in output_interfaces:
                if interface.is_image_kind:
                    res = self._create_images_result(interface=interface)
                elif interface.requires_file:
                    res = self._create_file_result(interface=interface)
                else:
                    res = self._create_json_result(interface=interface)

                outputs.append(res)

        return outputs
amickan commented 1 year ago

That would mean that creating json outputs not stored in the db is generally broken since this PR.

jmsmkn commented 1 year ago

> That would mean that creating json outputs not stored in the db is generally broken since this PR.

What does this mean? Jobs were incorrectly rejected? Or that values were stored for files or files for values, etc?

jmsmkn commented 1 year ago

Inspecting the Job failures, only 4 jobs have failed with a validation failure message in the past two months; three of those were for the algorithm that reported this issue, so I think "generally broken" is not true. There were several challenges that used outputs like this for MICCAI, so if jobs were regularly failing we would have heard about it sooner.

amickan commented 1 year ago

Ok, well that's good news. I'm not sure what's special about this case then. If what I suggest above is true, then you would expect all jobs with json file outputs to fail with a ValidationError, but if that's not the case, my analysis might be wrong.

amickan commented 1 year ago

I spent some more time debugging this and realized that I was wrong. Even though we create a JSON result, in the actual create_instance method we take the value and dump it into a file for interfaces that are not stored in the DB:

    def create_instance(self, *, image=None, value=None, fileobj=None):
        civ = ComponentInterfaceValue.objects.create(interface=self)

        if image:
            civ.image = image
        elif fileobj:
            container = File(fileobj)
            civ.file.save(Path(self.relative_path).name, container)
        elif self.saved_in_object_store:
            civ.file = ContentFile(
                json.dumps(value).encode("utf-8"),
                name=Path(self.relative_path).name,
            )
        else:
            civ.value = value

So saving outputs of json file type is not broken. It should work. But now I'm back at square one with trying to figure out what is causing the issues for Ward's algorithm.

jmsmkn commented 1 year ago

Could you maybe make a test that reproduces it? Even with fixtures if needed?

jmsmkn commented 1 year ago

Good to know that it's not a bigger issue though!

amickan commented 1 year ago

Yeah, that's the problem: I'm not able to reproduce it. Ward sent an example output file from his algorithm, and if I mock the part of _create_json_result that calls the S3 client and reads the file object downloaded from S3, and instead pass in the contents of the example file, the CIV gets created just fine, no errors.

So the issue must be somewhere within this part of the code, or with the initial upload of the produced json file to the S3 outputs bucket. Not sure how to test that further though.

        with io.BytesIO() as fileobj:
            self._s3_client.download_fileobj(
                Fileobj=fileobj,
                Bucket=settings.COMPONENTS_OUTPUT_BUCKET_NAME,
                Key=key,
            )
            fileobj.seek(0)
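
Roughly how I mocked that, for reference (a simplified test sketch, not the real test code; executor and interface are hypothetical fixtures for the backend executor and the nodule-locations ComponentInterface, and it assumes _s3_client can simply be patched on the instance):

    from unittest import mock


    def fake_download_fileobj(*, Fileobj, Bucket, Key):
        # Instead of hitting S3, write the contents of the example output file
        # into the buffer that _create_json_result reads from
        with open("example-nodule-locations.json", "rb") as f:
            Fileobj.write(f.read())


    def test_create_json_result_with_example_file(executor, interface):
        # executor and interface are hypothetical fixtures, see above
        s3_mock = mock.MagicMock()
        s3_mock.download_fileobj.side_effect = fake_download_fileobj
        with mock.patch.object(executor, "_s3_client", new=s3_mock):
            executor._create_json_result(interface=interface)  # should not raise
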
jmsmkn commented 1 year ago

I would be astounded if there is something wrong with those two lines.

OK, how about a sanity check: put that file in the simplest Algorithm container you can make and upload it, then see if you can reproduce it locally or on Grand Challenge?
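
Something like this should be enough for the container's entrypoint (a rough sketch; the /output path and file names are assumptions based on the interface, not taken from the real algorithm):

    # Minimal "algorithm": no model at all, just copy the example output into place
    import shutil
    from pathlib import Path

    # Assumed locations: the example file baked into the image, and the
    # interface's relative path under /output
    EXAMPLE = Path("/opt/algorithm/example-nodule-locations.json")
    OUTPUT = Path("/output/nodule-locations.json")

    if __name__ == "__main__":
        OUTPUT.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(EXAMPLE, OUTPUT)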

amickan commented 1 year ago

Yes, I can reproduce it. The parse outputs task fails with the following error. It's trying to validate a JSON that contains both the outputs and the inputs against the Multiple points schema, which is weird. I can't go back far enough in time in the Django admin on GC to see if Ward's jobs produced the same error or if this is because I'm outputting the JSON wrong in my dummy algorithm.

django.core.exceptions.ValidationError: {'__all__': ["JSON does not fulfill schema: [{'outputs': [{'type': 'Multiple points', 'points': [{'point': [122.19, -27.73, 1840.87]}, {'point': [88.87, 5.86, 1824.93]}], 'version': {'major': 1, 'minor': 0}}], 'inputs': [{'type': 'metaio_image', 'filename': 'f32fc983-087f-4a6b-891a-3c87d4caf745.mha'}], 'error_messages': []}] is not of type 'object'\n\nFailed validating 'type' in schema[0]:\n {'additionalProperties': False,\n 'properties': {'name': {'type': 'string'},\n 'points': {'items': {'allOf': [{'$ref': '#/definitions/point-object'}]},\n 'type': 'array'},\n 'type': {'enum': ['Multiple points']},\n 'version': {'$ref': '#/definitions/version-object'}},\n 'required': ['version', 'type', 'points'],\n 'type': 'object'}\n\nOn instance:\n [{'error_messages': [],\n 'inputs': [{'filename': 'f32fc983-087f-4a6b-891a-3c87d4caf745.mha',\n 'type': 'metaio_image'}],\n 'outputs': [{'points': [{'point': [122.19, -27.73, 1840.87]},\n {'point': [88.87, 5.86, 1824.93]}],\n 'type': 'Multiple points',\n 'version': {'major': 1, 'minor': 0}}]}]"]}

Full traceback:

    Traceback (most recent call last):
      File "/app/grandchallenge/components/backends/base.py", line 286, in _create_json_result
        civ = interface.create_instance(value=result)
      File "/app/grandchallenge/components/models.py", line 760, in create_instance
        civ.full_clean()
      File "/opt/poetry/.venv/lib/python3.8/site-packages/django/db/models/base.py", line 1251, in full_clean
        raise ValidationError(errors)
    django.core.exceptions.ValidationError: {'__all__': ["JSON does not fulfill schema: [{'outputs': [{'type': 'Multiple points', 'points': [{'point': [122.19, -27.73, 1840.87]}, {'point': [88.87, 5.86, 1824.93]}], 'version': {'major': 1, 'minor': 0}}], 'inputs': [{'type': 'metaio_image', 'filename': 'f32fc983-087f-4a6b-891a-3c87d4caf745.mha'}], 'error_messages': []}] is not of type 'object'\n\nFailed validating 'type' in schema[0]:\n {'additionalProperties': False,\n 'properties': {'name': {'type': 'string'},\n 'points': {'items': {'allOf': [{'$ref': '#/definitions/point-object'}]},\n 'type': 'array'},\n 'type': {'enum': ['Multiple points']},\n 'version': {'$ref': '#/definitions/version-object'}},\n 'required': ['version', 'type', 'points'],\n 'type': 'object'}\n\nOn instance:\n [{'error_messages': [],\n 'inputs': [{'filename': 'f32fc983-087f-4a6b-891a-3c87d4caf745.mha',\n 'type': 'metaio_image'}],\n 'outputs': [{'points': [{'point': [122.19, -27.73, 1840.87]},\n {'point': [88.87, 5.86, 1824.93]}],\n 'type': 'Multiple points',\n 'version': {'major': 1, 'minor': 0}}]}]"]}

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/app/grandchallenge/components/tasks.py", line 825, in parse_job_outputs
        outputs = executor.get_outputs(
      File "/app/grandchallenge/components/backends/base.py", line 82, in get_outputs
        res = self._create_json_result(interface=interface)
      File "/app/grandchallenge/components/backends/base.py", line 296, in _create_json_result
        raise ComponentException(
    grandchallenge.components.backends.exceptions.ComponentException: The output file 'nodule-locations.json' is not valid

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/opt/poetry/.venv/lib/python3.8/site-packages/celery/app/trace.py", line 451, in trace_task
        R = retval = fun(*args, **kwargs)
      File "/opt/poetry/.venv/lib/python3.8/site-packages/celery/app/trace.py", line 734, in __protected_call__
        return self.run(*args, **kwargs)
      File "/app/grandchallenge/components/tasks.py", line 830, in parse_job_outputs
        raise PriorStepFailed("Could not parse outputs")
    grandchallenge.components.exceptions.PriorStepFailed: Could not parse outputs
jmsmkn commented 1 year ago

I would double-check your algorithm outputs, as that doesn't look right at all. It looks like you're putting the development results.json in the nodule outputs.

jmsmkn commented 1 year ago

Also, you can rerun the job to get up-to-date error messages.

amickan commented 1 year ago

Yes, I had made a mistake in writing the outputs. I fixed that and reran the job, and I can still reproduce the error; it's a schema validation error:

    Traceback (most recent call last):
      File "/app/grandchallenge/components/backends/base.py", line 286, in _create_json_result
        civ = interface.create_instance(value=result)
      File "/app/grandchallenge/components/models.py", line 760, in create_instance
        civ.full_clean()
      File "/opt/poetry/.venv/lib/python3.8/site-packages/django/db/models/base.py", line 1251, in full_clean
        raise ValidationError(errors)
    django.core.exceptions.ValidationError: {'__all__': ["JSON does not fulfill schema: [{'type': 'Multiple points', 'points': [{'point': [122.19, -27.73, 1840.87]}, {'point': [88.87, 5.86, 1824.93]}], 'version': {'major': 1, 'minor': 0}}] is not of type 'object'\n\nFailed validating 'type' in schema[0]:\n {'additionalProperties': False,\n 'properties': {'name': {'type': 'string'},\n 'points': {'items': {'allOf': [{'$ref': '#/definitions/point-object'}]},\n 'type': 'array'},\n 'type': {'enum': ['Multiple points']},\n 'version': {'$ref': '#/definitions/version-object'}},\n 'required': ['version', 'type', 'points'],\n 'type': 'object'}\n\nOn instance:\n [{'points': [{'point': [122.19, -27.73, 1840.87]},\n {'point': [88.87, 5.86, 1824.93]}],\n 'type': 'Multiple points',\n 'version': {'major': 1, 'minor': 0}}]"]}

During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/app/grandchallenge/components/tasks.py", line 825, in parse_job_outputs
        outputs = executor.get_outputs(
      File "/app/grandchallenge/components/backends/base.py", line 82, in get_outputs
        res = self._create_json_result(interface=interface)
      File "/app/grandchallenge/components/backends/base.py", line 296, in _create_json_result
        raise ComponentException(
    grandchallenge.components.backends.exceptions.ComponentException: The output file 'nodule-locations.json' is not valid

During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/opt/poetry/.venv/lib/python3.8/site-packages/celery/app/trace.py", line 451, in trace_task
        R = retval = fun(*args, **kwargs)
      File "/opt/poetry/.venv/lib/python3.8/site-packages/celery/app/trace.py", line 734, in __protected_call__
        return self.run(*args, **kwargs)
      File "/app/grandchallenge/components/tasks.py", line 830, in parse_job_outputs
        raise PriorStepFailed("Could not parse outputs")
    grandchallenge.components.exceptions.PriorStepFailed: Could not parse outputs

jmsmkn commented 1 year ago

The JSON being validated is an array, not an object.
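
For comparison, a minimal sketch of what that schema expects versus what the failing job wrote (the output path here is an assumption):

    import json

    nodule_locations = {
        "version": {"major": 1, "minor": 0},
        "type": "Multiple points",
        "points": [
            {"point": [122.19, -27.73, 1840.87]},
            {"point": [88.87, 5.86, 1824.93]},
        ],
    }

    with open("/output/nodule-locations.json", "w") as f:
        # The Multiple points schema expects the object itself ...
        json.dump(nodule_locations, f)
        # ... not that object wrapped in a list, which is what the failing job produced:
        # json.dump([nodule_locations], f)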