sodadata / soda-sql

Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html
https://docs.soda.io/
Apache License 2.0

[Errno 30] Read-only file system when running inside AWS Lambda #200

Closed ivansabik closed 2 years ago

ivansabik commented 2 years ago

Describe the bug: When attempting to run soda-sql in a Python AWS Lambda, I get an error when importing ScanBuilder.

Steps to reproduce the behavior:

  1. Create a new Lambda function with the latest soda-sql installed.
  2. The Lambda should contain this line: from sodasql.scan.scan_builder import ScanBuilder
  3. Run the Lambda function.

Verify you get the following error:

{
  "errorMessage": "[Errno 30] Read-only file system: '/home/sbx_user1051'",
  "errorType": "OSError",
  "stackTrace": [
    "  File \"/var/lang/lib/python3.7/imp.py\", line 234, in load_module\n    return load_source(name, filename, file)\n",
    "  File \"/var/lang/lib/python3.7/imp.py\", line 171, in load_source\n    module = _load(spec)\n",
    "  File \"<frozen importlib._bootstrap>\", line 696, in _load\n",
    "  File \"<frozen importlib._bootstrap>\", line 677, in _load_unlocked\n",
    "  File \"<frozen importlib._bootstrap_external>\", line 728, in exec_module\n",
    "  File \"<frozen importlib._bootstrap>\", line 219, in _call_with_frames_removed\n",
    "  File \"/var/task/lambdas/scan.py\", line 2, in <module>\n    from sodasql.scan.scan_builder import ScanBuilder\n",
    "  File \"/opt/python/sodasql/scan/scan_builder.py\", line 19, in <module>\n    from sodasql.scan.scan import Scan\n",
    "  File \"/opt/python/sodasql/scan/scan.py\", line 34, in <module>\n    from sodasql.scan.warehouse import Warehouse\n",
    "  File \"/opt/python/sodasql/scan/warehouse.py\", line 17, in <module>\n    from sodasql.telemetry.soda_telemetry import SodaTelemetry\n",
    "  File \"/opt/python/sodasql/telemetry/soda_telemetry.py\", line 23, in <module>\n    class SodaTelemetry:\n",
    "  File \"/opt/python/sodasql/telemetry/soda_telemetry.py\", line 42, in SodaTelemetry\n    soda_config = ConfigHelper.get_instance()\n",
    "  File \"/opt/python/sodasql/common/config_helper.py\", line 38, in get_instance\n    ConfigHelper(path)\n",
    "  File \"/opt/python/sodasql/common/config_helper.py\", line 53, in __init__\n    self.init_config_file()\n",
    "  File \"/opt/python/sodasql/common/config_helper.py\", line 94, in init_config_file\n    self.file_system.mkdirs(self.file_system.dirname(destination))\n",
    "  File \"/opt/python/sodasql/scan/file_system.py\", line 53, in mkdirs\n    Path(expanded_path).mkdir(parents=True, exist_ok=True)\n",
    "  File \"/var/lang/lib/python3.7/pathlib.py\", line 1277, in mkdir\n    self.parent.mkdir(parents=True, exist_ok=True)\n",
    "  File \"/var/lang/lib/python3.7/pathlib.py\", line 1273, in mkdir\n    self._accessor.mkdir(self, mode)\n"
  ]
}

Context:

  OS: AWS Lambda
  Python Version: 3.7
  Soda SQL Version: 2.1.3
  Warehouse Type: Postgres

ivansabik commented 2 years ago

After taking a closer look, a potential solution would be to allow soda-sql to operate on the writable /tmp directory instead of the default location. Specifically, in " File \"/usr/local/lib/python3.7/site-packages/sodasql/scan/file_system.py\", line 53, in mkdirs\n Path(expanded_path).mkdir(parents=True, exist_ok=True)\n", if the expanded path could point to /tmp, that should do the trick.

Ammar-Azman commented 1 year ago

So how can the statement below be put into practice?

"After taking a closer look, a potential solution would be allowing to operate on the /tmp dir instead of the default. "
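
One possible workaround, offered as a sketch rather than a confirmed fix: the error path '/home/sbx_user1051' looks like a home directory, so if soda-sql derives its config location by expanding "~" (an assumption based on the name expanded_path in the traceback, not verified against the source), redirecting the HOME environment variable to a writable directory before the import should move that path. The helper redirect_home and the "~/.soda" directory below are hypothetical names used only for illustration. On POSIX systems os.path.expanduser honors HOME.

```python
import os
import tempfile
from pathlib import Path


def redirect_home(writable_dir: str) -> str:
    """Point HOME at a writable directory and return the previous value."""
    previous = os.environ.get("HOME", "")
    os.environ["HOME"] = writable_dir
    return previous


# In Lambda this would be "/tmp"; a fresh temp directory keeps the sketch portable.
writable_dir = tempfile.mkdtemp()
redirect_home(writable_dir)

# Hypothetical config location, used only to show the effect of the redirect.
config_dir = Path(os.path.expanduser("~/.soda"))
config_dir.mkdir(parents=True, exist_ok=True)

# Import soda-sql only after HOME has been redirected, for example:
# from sodasql.scan.scan_builder import ScanBuilder
```

In a Lambda handler module, the redirect would need to run at the very top of the file, before the sodasql import, because the traceback shows the directory being created at import time. Setting HOME=/tmp in the function's environment variables may achieve the same effect without code changes.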