databrickslabs / cicd-templates

Manage your Databricks deployments and CI with code.

Cannot read the python file dbfs:/Shared/dbx/projects/databricks_pipelines/../artifacts/tests/integration/sample_test.py #69

Closed: pebabion closed this issue 3 years ago

pebabion commented 3 years ago

I am following the template with Databricks hosted on AWS, running via GitHub Actions, but ran into the error below.

What could cause an error like this: "Cannot read the python file..."?

Appreciate any help. Thank you!

Run dbx launch --job=databricks-pipelines-sample-integration-test --as-run-submit --trace
  dbx launch --job=databricks-pipelines-sample-integration-test --as-run-submit --trace
  shell: /usr/bin/bash -e {0}
  env:
    DATABRICKS_HOST: ***
    DATABRICKS_TOKEN: ***
    pythonLocation: /opt/hostedtoolcache/Python/3.7.5/x64
    LD_LIBRARY_PATH: /opt/hostedtoolcache/Python/3.7.5/x64/lib
[dbx][2021-05-15 09:24:53.764] Launching job databricks-pipelines-sample-integration-test on environment default
[dbx][2021-05-15 09:24:53.765] Using configuration from the environment variables
[dbx][2021-05-15 09:24:55.251] No additional tags provided
[dbx][2021-05-15 09:24:55.254] Successfully found deployment per given job name
[dbx][2021-05-15 09:24:56.436] Launching job via run submit API
[dbx][2021-05-15 09:24:56.943] Run URL: ***#job/2703892/run/1
[dbx][2021-05-15 09:24:56.943] Tracing run with id 3032139
[dbx][2021-05-15 09:25:02.037] [Run Id: 3032139] Current run status info - result state: None, lifecycle state: PENDING, state message: Installing libraries
[dbx][2021-05-15 09:25:07.129] [Run Id: 3032139] Current run status info - result state: None, lifecycle state: PENDING, state message: Installing libraries
[dbx][2021-05-15 09:25:12.227] [Run Id: 3032139] Current run status info - result state: None, lifecycle state: RUNNING, state message: In run
[dbx][2021-05-15 09:25:17.325] [Run Id: 3032139] Current run status info - result state: None, lifecycle state: RUNNING, state message: In run
[dbx][2021-05-15 09:25:22.417] [Run Id: 3032139] Current run status info - result state: None, lifecycle state: RUNNING, state message: In run
[dbx][2021-05-15 09:25:27.509] [Run Id: 3032139] Current run status info - result state: FAILED, lifecycle state: INTERNAL_ERROR, state message: Cannot read the python file dbfs:/Shared/dbx/projects/databricks_pipelines/2dc5616b50a943dc96e014e06174abda/artifacts/tests/integration/sample_test.py. Please check driver logs for more details.
[dbx][2021-05-15 09:25:27.510] Finished tracing run with id 3032139
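
For context, the --trace flag in the step above polls the Jobs API until the run leaves the PENDING/RUNNING lifecycle states. A minimal sketch of that polling loop (an illustration of the mechanism, not dbx's actual implementation; the trace_run helper is hypothetical), assuming the DATABRICKS_HOST and DATABRICKS_TOKEN variables from the workflow step:

import os
import time
import requests

host = os.environ["DATABRICKS_HOST"].rstrip("/")
token = os.environ["DATABRICKS_TOKEN"]
headers = {"Authorization": f"Bearer {token}"}

def trace_run(run_id, poll_seconds=5):
    # Hypothetical helper: poll /api/2.0/jobs/runs/get until the run
    # reaches a terminal lifecycle state, printing status along the way.
    while True:
        resp = requests.get(
            f"{host}/api/2.0/jobs/runs/get",
            headers=headers,
            params={"run_id": run_id},
        )
        resp.raise_for_status()
        state = resp.json()["state"]
        print(
            state["life_cycle_state"],
            state.get("result_state"),
            state.get("state_message", ""),
        )
        if state["life_cycle_state"] not in ("PENDING", "RUNNING"):
            return state
        time.sleep(poll_seconds)

# Example: trace_run(3032139)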
renardeinside commented 3 years ago

Hi @kelvin1794, thanks a lot for raising the issue. Unfortunately, I cannot reproduce it in the dev environment, so I need more info for debugging. Could you please show the contents of this folder via the CLI command:

databricks fs ls dbfs:/Shared/dbx/projects/databricks_pipelines/2dc5616b50a943dc96e014e06174abda/artifacts

and this one:

databricks fs ls dbfs:/Shared/dbx/projects/databricks_pipelines/2dc5616b50a943dc96e014e06174abda/artifacts/tests/integration

?
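
(For completeness, the same listing can be done against the DBFS REST API; a minimal sketch assuming the DATABRICKS_HOST and DATABRICKS_TOKEN variables from the workflow step, where dbfs_ls is a hypothetical helper. Note that the REST API takes scheme-less paths, i.e. /Shared/... rather than dbfs:/Shared/...:)

import os
import requests

host = os.environ["DATABRICKS_HOST"].rstrip("/")
token = os.environ["DATABRICKS_TOKEN"]

def dbfs_ls(path):
    # Hypothetical helper: GET /api/2.0/dbfs/list returns
    # {"files": [{"path": ..., "is_dir": ..., "file_size": ...}, ...]}
    resp = requests.get(
        f"{host}/api/2.0/dbfs/list",
        headers={"Authorization": f"Bearer {token}"},
        params={"path": path},
    )
    resp.raise_for_status()
    return resp.json().get("files", [])

for entry in dbfs_ls("/Shared/dbx/projects/databricks_pipelines/2dc5616b50a943dc96e014e06174abda/artifacts"):
    print(entry["path"], entry["is_dir"])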

pebabion commented 3 years ago

Hi @renardeinside, thank you so much for helping.

When I run

databricks fs ls dbfs:/Shared/dbx/projects/databricks_pipelines/2dc5616b50a943dc96e014e06174abda/artifacts

this is the result:

.dbx
dist
tests

And when I run

databricks fs ls dbfs:/Shared/dbx/projects/databricks_pipelines/2dc5616b50a943dc96e014e06174abda/artifacts/tests/integration

this is the result:

sample_test.py

Also, just a side note: when I installed the wheel file on the cluster, there seems to be a conflict, because when I run a notebook, all the cell results come back as "Cancelled".

renardeinside commented 3 years ago

@kelvin1794, could you please try the following command:

databricks fs cat dbfs:/Shared/dbx/projects/databricks_pipelines/2dc5616b50a943dc96e014e06174abda/artifacts/tests/integration/sample_test.py

I have a guess that for some strange reason your user has no permission to read the file. Do you see the file content in the output?
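
(One way to probe that permission guess outside the CLI is to read the file through the DBFS REST API, which would likely fail with a 403 if permissions were the problem. A hedged sketch, reusing the same env-variable assumptions as above:)

import base64
import os
import requests

host = os.environ["DATABRICKS_HOST"].rstrip("/")
token = os.environ["DATABRICKS_TOKEN"]

# GET /api/2.0/dbfs/read returns base64-encoded file bytes
# (length is capped at 1 MB per request).
resp = requests.get(
    f"{host}/api/2.0/dbfs/read",
    headers={"Authorization": f"Bearer {token}"},
    params={
        "path": "/Shared/dbx/projects/databricks_pipelines/2dc5616b50a943dc96e014e06174abda/artifacts/tests/integration/sample_test.py",
        "offset": 0,
        "length": 1048576,
    },
)
resp.raise_for_status()
print(base64.b64decode(resp.json()["data"]).decode("utf-8"))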

pebabion commented 3 years ago

@renardeinside, thanks for the prompt reply.

When I run it, the output is the content of the sample_test.py file, shown below. Can you think of any other possible cause? Is there some extra configuration I need to do with dbx?

import unittest

from amgen_databricks_pipelines.jobs.sample.entrypoint import SampleJob
from uuid import uuid4
from pyspark.dbutils import DBUtils  # noqa

class SampleJobIntegrationTest(unittest.TestCase):
    def setUp(self):

        self.test_dir = "dbfs:/tmp/tests/sample/%s" % str(uuid4())
        self.test_config = {"output_format": "delta", "output_path": self.test_dir}

        self.job = SampleJob(init_conf=self.test_config)
        self.dbutils = DBUtils(self.job.spark)
        self.spark = self.job.spark

    def test_sample(self):

        self.job.launch()

        output_count = (
            self.spark.read.format(self.test_config["output_format"])
            .load(self.test_config["output_path"])
            .count()
        )

        self.assertGreater(output_count, 0)

    def tearDown(self):
        self.dbutils.fs.rm(self.test_dir, True)

if __name__ == "__main__":
    # please don't change the logic of test result checks here
    # it's intentionally done in this way to comply with jobs run result checks
    # for other tests, please simply replace the SampleJobIntegrationTest with your custom class name
    loader = unittest.TestLoader()
    tests = loader.loadTestsFromTestCase(SampleJobIntegrationTest)
    runner = unittest.TextTestRunner(verbosity=2)
    result = runner.run(tests)
    if not result.wasSuccessful():
        raise RuntimeError(
            "One or multiple tests failed. Please check job logs for additional information."
        )
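
(As background on where the failing dbfs:/ path comes from: dbx deploy uploads local file references from the deployment file to the project's artifact location on DBFS and rewrites them to dbfs:/ paths, which is how the run-submit job ends up pointing at .../artifacts/tests/integration/sample_test.py. A trimmed sketch of what the corresponding job entry in conf/deployment.json typically looks like in this template; cluster settings are omitted and this is not copied from this exact project:)

{
    "default": {
        "jobs": [
            {
                "name": "databricks-pipelines-sample-integration-test",
                "spark_python_task": {
                    "python_file": "tests/integration/sample_test.py"
                }
            }
        ]
    }
}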
renardeinside commented 3 years ago

I'm still not sure about the root cause, to be honest. Could you please take a look at the run logs for run ***#job/2703892/run/1? The error log and log4j sections might provide some more clues.
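
(If digging through the UI from CI is awkward, the legacy Databricks CLI can also pull the run's output and error message; a one-liner, assuming the runs command group of the legacy CLI is installed:)

databricks runs get-output --run-id 3032139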

pebabion commented 3 years ago

Thanks @renardeinside. I guess I'll have to dig in, and I will update this thread if I find the answer.

pebabion commented 3 years ago

@renardeinside, when I tested on a different Databricks instance, it worked fine, so I guess it was something to do with the previous Databricks environment. Thank you!