aws-samples / annotate-medical-images-in-dicom-server-and-build-ml-models-on-amazon-sagemaker

MIT No Attribution

Label adjustment (review) support #9

Open athewsey opened 2 years ago

athewsey commented 2 years ago

Issue #, if available: #8

Description of changes:

Limitations / Open questions:

  1. The name of the previous-annotation attribute on the input manifest is currently hard-coded as prevAnnotation, so users need to transform the output manifest of their initial labelling job before using it as the input for an adjustment job. An alternative would be to generate the final liquid.html file dynamically at the point of use and accept a parameter controlling what this attribute name is and whether it exists (for example, as done in the "Set up the custom task template" section of this notebook). But this adds an extra layer of abstraction.
    • This also affects the pre-labelling Lambda function as currently written
  2. The data the template currently saves to SMGT output is not sufficient to fully reconstruct the labelling tool state: for example, there's nothing to differentiate between rectangle and ellipse ROIs. For this initial draft I chose to just hack around it rather than change the output format of the job, but I think it'd be better to extend the job output?
  3. The consolidation Lambda here doesn't really do any consolidation: it just outputs the list of all annotations. So for review I simply assumed that exactly one worker had annotated the original. More generally, I think there's a fair amount of refactoring that could be done on the objects passed between each stage to tidy things up?
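
To illustrate the manifest transformation mentioned in point 1, here is a minimal sketch in Python. It assumes the job 1 output manifest stores the annotations under a label attribute named after the job (as the `-metadata` comment in the example further down suggests), and it enforces the single-worker assumption from point 3. The function names `to_review_entry` and `convert_manifest` are hypothetical and not part of this PR.

```python
import json

# Attribute name the review template expects (taken from the example manifest
# entry in this PR description).
PREV_ATTR = "prevAnnotation"


def to_review_entry(output_entry: dict, label_attr: str) -> dict:
    """Rename job 1's label attribute so the entry can feed an adjustment job.

    `label_attr` is an assumption: the label attribute name used by the first
    labelling job (often the job name).
    """
    entry = dict(output_entry)  # shallow copy, so the caller's dict keeps its keys
    if label_attr not in entry:
        raise KeyError(f"Label attribute {label_attr!r} not found in manifest entry")

    entry[PREV_ATTR] = entry.pop(label_attr)
    meta_key = f"{label_attr}-metadata"
    if meta_key in entry:
        entry[f"{PREV_ATTR}-metadata"] = entry.pop(meta_key)

    # The current draft assumes exactly one worker annotated the original.
    workers = entry[PREV_ATTR].get("annotationsFromAllWorkers", [])
    if len(workers) != 1:
        raise ValueError(f"Expected exactly one worker annotation, found {len(workers)}")
    return entry


def convert_manifest(lines, label_attr: str) -> list:
    """Convert JSON Lines manifest text, one entry per line, skipping blank lines."""
    return [
        json.dumps(to_review_entry(json.loads(line), label_attr))
        for line in lines
        if line.strip()
    ]
```

Each returned string is one line of the review job's input manifest. If the template were parameterized on the attribute name instead, this renaming step would not be needed.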

Assumptions of current draft:

An original input manifest entry looks something like this (on a single line in the actual manifest):

{
  "source": "59475f39-919111b5-4f82c54c-09303b20-12345678",
  "labels": "Atelectasis,Cardiomegaly,No Finding"
  // Or "labels": ["Atelectasis", "Cardiomegaly", "No Finding"]
}
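
Since "labels" may arrive either as a comma-separated string or as a list, code consuming the manifest has to accept both. A minimal sketch of that normalization, with `parse_labels` being a hypothetical helper name (how the pre-labelling Lambda in this PR actually handles it is not shown here):

```python
def parse_labels(labels):
    """Return the candidate labels as a list of strings.

    Accepts either form shown in the manifest example: a comma-separated
    string or a list of strings.
    """
    if isinstance(labels, str):
        # Split on commas and drop surrounding whitespace and empty pieces.
        return [part.strip() for part in labels.split(",") if part.strip()]
    return [str(item) for item in labels]
```

Note that the string form cannot represent a label that itself contains a comma, which is one reason the list form may be preferable.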

A second-round (review) input manifest entry looks something like this (on a single line in the actual manifest):

{
  "source":"59475f39-919111b5-4f82c54c-09303b20-12345678",
  "labels": "Atelectasis,Cardiomegaly,No Finding",
  // Or "labels": ["Atelectasis", "Cardiomegaly", "No Finding"]
  "prevAnnotation": {
    "annotationsFromAllWorkers": [
      // Exactly one previous worker:
      {
        "workerId":"private.us-east-1.1234567890abcdef",
        "annotationData": {
          // content is stringified (which I think we could avoid), and
          // content.label is again stringified within that
          "content": "{\"disease\":{\"Atelectasis\":false,\"Cardiomegaly\":false,\"No Finding\":true},\"labels\":\"{\\\"label\\\":[\\\"Pleural Other\\\"],\\\"imageurl\\\":\\\"https://abcdef1234567890.cloudfront.net/instances/59475f39-919111b5-4f82c54c-09303b20-12345678/file\\\",\\\"ROI\\\":{\\\"start\\\":{\\\"x\\\":234.4744744744745,\\\"y\\\":206.9525502064565,\\\"highlight\\\":true,\\\"active\\\":false},\\\"end\\\":{\\\"x\\\":292.9009009009009,\\\"y\\\":238.472069725976,\\\"highlight\\\":true,\\\"active\\\":false},\\\"boundingBox\\\":{\\\"width\\\":130.11666870117188,\\\"height\\\":65,\\\"left\\\":381,\\\"top\\\":257.19999694824224}}}\"}"
        }
      }
    ]
  },
  // Optional and ignored: the -metadata attribute (would originally have
  // been 'actual-smgt-job-1-name-metadata' in the job 1 output manifest)
  "prevAnnotation-metadata": {
    "type":"groundtruth/custom",
    "job-name":"actual-smgt-job-1-name",
    "human-annotated":"yes",
    "creation-date":"2022-04-22T06:00:25.913000"
  }
}
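
Because "content" is a JSON string and "content.labels" is a second JSON string nested inside it, recovering the previous annotation takes two decoding passes. A minimal sketch in Python, assuming the entry shape shown above and exactly one previous worker; `decode_prev_annotation` is a hypothetical name, not a function from this PR:

```python
import json


def decode_prev_annotation(entry: dict, attr: str = "prevAnnotation") -> dict:
    """Decode the doubly stringified annotation from a review manifest entry."""
    workers = entry[attr]["annotationsFromAllWorkers"]
    if len(workers) != 1:
        # The current draft assumes exactly one worker annotated the original.
        raise ValueError(f"Expected exactly one worker annotation, found {len(workers)}")

    # First pass: "content" is itself a JSON string.
    content = json.loads(workers[0]["annotationData"]["content"])

    # Second pass: "labels" inside the content is stringified again.
    labels = content.get("labels")
    if isinstance(labels, str):
        content["labels"] = json.loads(labels)
    return content
```

With the example entry above, the result's "disease" key holds the per-class booleans, and its "labels" key holds the decoded object containing "label", "imageurl" and "ROI".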


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.