girder / girder_worker

Distributed task execution engine with Girder integration, developed by Kitware
http://girder-worker.readthedocs.io/
Apache License 2.0

converter_path raises NetworkXNoPath exception without specifying src and dest types #32

Open cdeepakroy opened 8 years ago

cdeepakroy commented 8 years ago

@zachmullen @danlamanna

worker.format.converter_path raises NetworkXNoPath but does not specify the source and target types between which the conversion is being attempted.

Below is my task spec:

{
  "auto_convert": true,
  "cleanup": true,
  "inputs": {
    "foreground_threshold": {
      "data": "160",
      "format": "json",
      "mode": "inline",
      "type": "number"
    },
    "inputImageFile": {
      "api_url": "http://localhost:8080/api/v1",
      "format": "string",
      "id": "573cc705a848737a396e529e",
      "mode": "girder",
      "name": "Easy1.png",
      "resource_type": "item",
      "token": "NOWAfxaQe3Es7SfGMa4VbEDFhnamHhKyJCjBX54nh0Yv1yGNd1eJxZcN6Gh0EplM",
      "type": "string"
    },
    "local_max_search_radius": {
      "data": "10",
      "format": "json",
      "mode": "inline",
      "type": "number"
    },
    "max_radius": {
      "data": "7",
      "format": "json",
      "mode": "inline",
      "type": "number"
    },
    "min_nucleus_area": {
      "data": "80",
      "format": "json",
      "mode": "inline",
      "type": "number"
    },
    "min_radius": {
      "data": "4",
      "format": "json",
      "mode": "inline",
      "type": "number"
    },
    "stain_1": {
      "data": "hematoxylin",
      "format": "json",
      "mode": "inline",
      "type": "string"
    },
    "stain_2": {
      "data": "eosin",
      "format": "json",
      "mode": "inline",
      "type": "string"
    },
    "stain_3": {
      "data": "null",
      "format": "json",
      "mode": "inline",
      "type": "string"
    }
  },
  "jobInfo": {
    "headers": {
      "Girder-Token": "S3N0OqmQ5dOHXW4YMpNKT8PE5I1jnxWlBMacouZ0PkuVby9VgM5G7tFZ6VNYEnM1"
    },
    "logPrint": true,
    "method": "PUT",
    "reference": "573cca08a8487305d18b1a4f",
    "url": "http://localhost:8080/api/v1/job/573cca08a8487305d18b1a4f"
  },
  "outputs": {
    "outputNucleiAnnotationFile": {
      "api_url": "http://localhost:8080/api/v1",
      "format": "string",
      "mode": "girder",
      "name": "Easy1_nuclei.anot",
      "parent_id": "573cc72ca848737a396e52a0",
      "parent_type": "folder",
      "token": "NOWAfxaQe3Es7SfGMa4VbEDFhnamHhKyJCjBX54nh0Yv1yGNd1eJxZcN6Gh0EplM",
      "type": "string"
    },
    "outputNucleiMaskFile": {
      "api_url": "http://localhost:8080/api/v1",
      "format": "string",
      "mode": "girder",
      "name": "Easy1_seg.png",
      "parent_id": "573cc72ca848737a396e52a0",
      "parent_type": "folder",
      "token": "NOWAfxaQe3Es7SfGMa4VbEDFhnamHhKyJCjBX54nh0Yv1yGNd1eJxZcN6Gh0EplM",
      "type": "string"
    }
  },
  "task": {
    "container_args": [
      "NucleiSegmentation",
      "--foreground_threshold",
      "160",
      "--local_max_search_radius",
      "10",
      "--max_radius",
      "7",
      "--min_nucleus_area",
      "80",
      "--min_radius",
      "4",
      "--stain_1",
      "hematoxylin",
      "--stain_2",
      "eosin",
      "--stain_3",
      "null",
      "/mnt/girder_worker/data/Easy1.png",
      "/mnt/girder_worker/data/Easy1_seg.png",
      "/mnt/girder_worker/data/Easy1_nuclei.anot"
    ],
    "docker_image": "dsarchive/histomicstk:dev",
    "inputs": [
      {
        "format": "string",
        "id": "inputImageFile",
        "name": "Input Image",
        "target": "filepath",
        "type": "string"
      },
      {
        "default": {
          "data": 160,
          "format": "number"
        },
        "format": "number",
        "id": "foreground_threshold",
        "type": "number"
      },
      {
        "default": {
          "data": 10,
          "format": "number"
        },
        "format": "number",
        "id": "local_max_search_radius",
        "type": "number"
      },
      {
        "default": {
          "data": 7,
          "format": "number"
        },
        "format": "number",
        "id": "max_radius",
        "type": "number"
      },
      {
        "default": {
          "data": 80,
          "format": "number"
        },
        "format": "number",
        "id": "min_nucleus_area",
        "type": "number"
      },
      {
        "default": {
          "data": 4,
          "format": "number"
        },
        "format": "number",
        "id": "min_radius",
        "type": "number"
      },
      {
        "default": {
          "data": "hematoxylin",
          "format": "string"
        },
        "format": "string",
        "id": "stain_1",
        "type": "string"
      },
      {
        "default": {
          "data": "eosin",
          "format": "string"
        },
        "format": "string",
        "id": "stain_2",
        "type": "string"
      },
      {
        "default": {
          "data": "null",
          "format": "string"
        },
        "format": "string",
        "id": "stain_3",
        "type": "string"
      }
    ],
    "mode": "docker",
    "name": "NucleiSegmentation",
    "outputs": [
      {
        "format": "string",
        "id": "outputNucleiMaskFile",
        "name": "Output Nuclei Segmentation Mask",
        "path": "Easy1_seg.png",
        "target": "filepath",
        "type": "string"
      },
      {
        "format": "string",
        "id": "outputNucleiAnnotationFile",
        "name": "Output Nuclei Annotation File",
        "path": "Easy1_nuclei.anot",
        "target": "filepath",
        "type": "string"
      }
    ],
    "pull_image": true
  },
  "validate": false
}

which raises the following exception:

<class 'networkx.exception.NetworkXNoPath'>: 
  File "/media/common/EmoryImageAnnotationPlatform/code/girder_worker/srclnx/girder_worker/__main__.py", line 28, in run
    retval = girder_worker.run(*pargs, **kwargs)
  File "girder_worker/utils.py", line 295, in wrapped
    return fn(*args, **kwargs)
  File "girder_worker/__init__.py", line 277, in run
    {'task_input': task_input, 'fetch': False}, **kwargs))
  File "girder_worker/__init__.py", line 150, in convert
    Validator(type, output['format'])):
  File "girder_worker/format/__init__.py", line 113, in converter_path
    raise NetworkXNoPath

cdeepakroy commented 8 years ago

It would be helpful to print the source and target types when that exception is raised, for easier debugging.

danlamanna commented 8 years ago

It looks like there exists a type/format combination in your spec that isn't registered in the worker.

Looking at your type/formats I would guess it's because string/string should be string/text. Let me know if that doesn't fix it for you.
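
To spot which bindings use an unexpected combination, the type/format pair of every task input and output can be listed and checked by eye. This is only a minimal sketch, assuming the spec layout shown in the task spec above; `list_type_formats` is a hypothetical helper and not part of girder_worker.

```python
def list_type_formats(spec):
    """Return (section, id, type, format) for each task input and output."""
    rows = []
    for section in ("inputs", "outputs"):
        for port in spec.get("task", {}).get(section, []):
            rows.append(
                (section, port.get("id"), port.get("type"), port.get("format")))
    return rows


# Trimmed-down version of the task spec above.
spec = {
    "task": {
        "inputs": [
            {"id": "inputImageFile", "type": "string", "format": "string"},
            {"id": "foreground_threshold", "type": "number", "format": "number"},
        ],
        "outputs": [
            {"id": "outputNucleiMaskFile", "type": "string", "format": "string"},
        ],
    }
}

for section, port_id, type_, format_ in list_type_formats(spec):
    print(f"{section}: {port_id} -> {type_}/{format_}")
```

Any row showing string/string is a candidate for the string/text change suggested above.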

zachmullen commented 8 years ago

Indeed. There is also an actual bug here that we should fix, which is that the exception that we propagate in this case is pretty unhelpful. We should catch it and raise one of our own that contains information about the spec that caused the breakage, most importantly the source and dest types and formats, and the id of the input or output.

danlamanna commented 8 years ago

The functions that determine how to go from one format to another have no idea what piece of data they're dealing with, and I would prefer to keep it that way if possible.

I initially implemented it a bit differently, which was to remove this try/except. In this case it would have just thrown an Exception stating No such validator string/string. Does this work for you?

jeffbaumes commented 8 years ago

I agree that No such validator string/string is the correct error in this case, so we should do that (and add a test for it). We should also implement and test that two valid formats with no path between them produce No path from a/b to a/c instead of NetworkXNoPath.
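
The two error cases described here could look roughly like the sketch below. It is not the girder_worker implementation: the real converter_path works on a networkx graph, whereas this version swaps in a plain dictionary and a breadth-first search so that it has no dependencies. The exception classes and the `CONVERTERS` registry (including its entries) are hypothetical names made up for illustration.

```python
from collections import deque


class NoSuchValidator(Exception):
    """Raised when a type/format pair is not registered."""


class NoConversionPath(Exception):
    """Raised when both pairs are registered but no conversion links them."""


# Hypothetical registry: each (type, format) node maps to the nodes it can
# be converted to directly. The entries are illustrative only.
CONVERTERS = {
    ("number", "number"): [("number", "json")],
    ("number", "json"): [("number", "number")],
    ("string", "text"): [("string", "json")],
    ("string", "json"): [("string", "text")],
    ("a", "b"): [],
    ("a", "c"): [],
}


def converter_path(source, target):
    """Return the list of nodes from source to target, inclusive."""
    # Report an unregistered pair before searching for a path.
    for node in (source, target):
        if node not in CONVERTERS:
            raise NoSuchValidator("No such validator %s/%s" % node)

    # Breadth-first search, remembering each node's predecessor.
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in CONVERTERS[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)

    # Both ends are valid, but nothing connects them.
    raise NoConversionPath("No path from %s/%s to %s/%s" % (source + target))
```

With this shape, a test for each case only needs to assert on the exception type and message, for example that `converter_path(("a", "b"), ("a", "c"))` raises `NoConversionPath`.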

cdeepakroy commented 8 years ago

No such validator string/string is a good start. But as zach said, it would be nice to know the ID of the input/output that is causing this error. If there is no way of knowing the ID in converter_path, then maybe raise something like No such validator string/string there, catch it somewhere else where the ID can be obtained, and raise another exception there that includes the ID.

zachmullen commented 8 years ago

My suggestion would be to wrap this logic in a try/except block and then raise something containing the types & formats as well as the ID. Within this block we should change it to two different try/excepts, one for the source tuple and one for the dest tuple, and if we raise an exception from that it should include the information from the offending tuple. Does that make sense?
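
A rough sketch of that structure follows, under stated assumptions: `ConversionError`, `REGISTERED`, `get_validator` and `check_binding` are all hypothetical stand-ins rather than girder_worker names. The point is only that the source and destination lookups fail separately, and that the caller, which knows the binding ID, adds it to the message.

```python
class ConversionError(Exception):
    """Hypothetical error that names the offending binding."""


# Hypothetical set of registered type/format pairs, for illustration only.
REGISTERED = {
    ("string", "text"),
    ("string", "json"),
    ("number", "number"),
    ("number", "json"),
}


def get_validator(type_, format_):
    """Stand-in lookup that fails for unregistered pairs."""
    if (type_, format_) not in REGISTERED:
        raise KeyError("No such validator %s/%s" % (type_, format_))
    return (type_, format_)


def check_binding(port_id, source, dest):
    """Validate both ends of a conversion, reporting the ID on failure."""
    # First try/except: the source tuple.
    try:
        src = get_validator(*source)
    except KeyError:
        raise ConversionError(
            "Binding '%s': unknown source type/format %s/%s"
            % ((port_id,) + source))
    # Second try/except: the destination tuple.
    try:
        dst = get_validator(*dest)
    except KeyError:
        raise ConversionError(
            "Binding '%s': unknown destination type/format %s/%s"
            % ((port_id,) + dest))
    return src, dst


try:
    check_binding("inputImageFile", ("string", "string"), ("string", "text"))
except ConversionError as exc:
    print(exc)
```

The conversion functions themselves stay unaware of which piece of data they are handling; only the wrapper knows the ID.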

cdeepakroy commented 8 years ago

+1 for what zach suggests

cdeepakroy commented 8 years ago

Also, can we use 'string' instead of 'text' for the non-JSON format of the string type?

All the other simple types seem to follow that pattern; this is the only exception.