open2c / cooler

A cool place to store your Hi-C
https://open2c.github.io/cooler
BSD 3-Clause "New" or "Revised" License

Can't import cooler in aws lambda because multiprocessing is not supported #329

Open pkerpedjiev opened 1 year ago

pkerpedjiev commented 1 year ago

Here is the error message I get:

{
  "errorMessage": "[Errno 38] Function not implemented",
  "errorType": "OSError",
  "requestId": "",
  "stackTrace": [
    "  File \"/usr/local/lib/python3.10/importlib/__init__.py\", line 126, in import_module\n    return _bootstrap._gcd_import(name[level:], package, level)\n",
    "  File \"<frozen importlib._bootstrap>\", line 1050, in _gcd_import\n",
    "  File \"<frozen importlib._bootstrap>\", line 1027, in _find_and_load\n",
    "  File \"<frozen importlib._bootstrap>\", line 1006, in _find_and_load_unlocked\n",
    "  File \"<frozen importlib._bootstrap>\", line 688, in _load_unlocked\n",
    "  File \"<frozen importlib._bootstrap_external>\", line 883, in exec_module\n",
    "  File \"<frozen importlib._bootstrap>\", line 241, in _call_with_frames_removed\n",
    "  File \"/function/app.py\", line 1, in <module>\n    from clodius.tiles.cooler import generate_tiles\n",
    "  File \"/function/clodius/tiles/cooler.py\", line 5, in <module>\n    import cooler\n",
    "  File \"/function/cooler/__init__.py\", line 12, in <module>\n    from . import balance, create, fileops, parallel, tools\n",
    "  File \"/function/cooler/balance.py\", line 8, in <module>\n    from .parallel import partition, split\n",
    "  File \"/function/cooler/parallel.py\", line 42, in <module>\n    lock = Lock()\n",
    "  File \"/function/multiprocess/context.py\", line 68, in Lock\n    return Lock(ctx=self.get_context())\n",
    "  File \"/function/multiprocess/synchronize.py\", line 168, in __init__\n    SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)\n",
    "  File \"/function/multiprocess/synchronize.py\", line 63, in __init__\n    sl = self._semlock = _multiprocessing.SemLock(\n"
  ]
}

Reference: https://stackoverflow.com/questions/34005930/multiprocessing-semlock-is-not-implemented-when-running-on-aws-lambda

Not sure what the best solution is. In my use case I'm not using cooler's balance functionality, so I will likely just wrap that import in a try/except statement. Maybe it's possible to remove the balance import (which pulls in parallel) from __init__.py so that it's not loaded automatically?
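For readers wanting to try the import-wrapping workaround mentioned above, here is a minimal sketch. The helper name `optional_import` is hypothetical and not part of cooler or clodius. It catches `OSError` as well as `ImportError`, because the failure in the traceback is an `OSError` raised while the module body runs, not a missing module.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if importing it fails.

    Hypothetical helper for illustration only. OSError is caught in
    addition to ImportError because module-level code (such as creating
    a lock) can raise it on platforms without semaphore support.
    """
    try:
        return importlib.import_module(name)
    except (ImportError, OSError):
        return None


# Usage: code that needs the module must check for None first.
cooler = optional_import("cooler")
if cooler is None:
    print("cooler is unavailable in this environment")
```

The cost of this approach is that every caller has to handle the `None` case, which is why a fix inside the library itself may be preferable.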

nvictus commented 1 year ago

I don't think the import itself is the problem; it's the module-level Lock() instantiation, which Lambda doesn't support. Could you try wrapping that in a try block and setting lock to None if it fails?
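A rough sketch of that suggestion, under some assumptions: it uses the standard library's `multiprocessing` rather than the `multiprocess` package seen in the traceback, and the `make_lock` helper is a hypothetical name introduced only so the fallback can be exercised. It is not cooler's actual code.

```python
from multiprocessing import Lock


def make_lock(factory=Lock):
    """Create a lock, or return None where semaphores are unavailable.

    Hypothetical helper; `factory` is a parameter only so the failure
    path can be simulated without running on AWS Lambda.
    """
    try:
        return factory()
    except (OSError, ImportError):
        # OSError: e.g. "[Errno 38] Function not implemented" on Lambda.
        # ImportError: raised by the stdlib on platforms that lack a
        # functioning sem_open implementation.
        return None


# Module-level lock, as in the traceback, but tolerant of failure.
lock = make_lock()
```

Any code that later uses `lock` then has to tolerate it being `None`.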

pkerpedjiev commented 1 year ago

Won't that fail here: https://github.com/open2c/cooler/blob/master/src/cooler/parallel.py#L256?

nvictus commented 1 year ago

Yeah, that would need guarding if it works:

if self.use_lock and lock is not None:

But that tooling is only used internally by balance anyway.
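To illustrate the guard, here is a small self-contained sketch. The `GuardedTask` class is hypothetical and only mimics the shape of the condition quoted above; it is not the code at the linked line in parallel.py. The module-level `lock = None` stands in for the case where lock creation failed.

```python
# Stands in for a module-level lock that could not be created.
lock = None


class GuardedTask:
    """Hypothetical wrapper that takes the lock only when one exists."""

    def __init__(self, func, use_lock=True):
        self.func = func
        self.use_lock = use_lock

    def __call__(self, *args):
        # The guard from the thread: skip locking when no lock is available.
        if self.use_lock and lock is not None:
            lock.acquire()
            try:
                return self.func(*args)
            finally:
                lock.release()
        return self.func(*args)


# Usage: runs without error even though no lock could be created.
task = GuardedTask(lambda x: x + 1)
result = task(41)
```

Note that when the lock is missing the call simply runs unsynchronized, which is only safe if nothing else is accessing the shared resource concurrently.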