Open oliverrmaa opened 1 week ago
To resolve the intermittent `FileNotFoundError: [Errno 2] No such file or directory` for /tmp files that is causing task failures for dbt Cosmos tasks in your Astronomer deployment, you should ensure that the temporary directories and files are properly managed and still exist when needed.
Here are some steps to address this issue:
Manage Temporary Directories Properly: Ensure that the temporary directories are not cleaned up prematurely. Here is a revised version of the function to ensure the temporary directory is properly managed:
```python
import os
import shutil
import tempfile
from pathlib import Path

import pytest

# MINI_DBT_PROJ_DIR and MINI_DBT_PROJ_DIR_FAILING_SCHEMA are the
# project-path constants defined in the test module.


@pytest.fixture
def failing_test_dbt_project(tmp_path):
    # Keep the temporary directory alive for as long as the fixture is
    # in use; it is only cleaned up after the yield returns.
    with tempfile.TemporaryDirectory() as tmp_dir:
        tmp_dir_path = Path(tmp_dir) / "mini"
        shutil.copytree(MINI_DBT_PROJ_DIR, tmp_dir_path)
        target_schema = tmp_dir_path / "models/schema.yml"
        if target_schema.exists():
            os.remove(target_schema)
        shutil.copy(MINI_DBT_PROJ_DIR_FAILING_SCHEMA, target_schema)
        yield tmp_dir_path
```
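For context, here is a minimal, Cosmos-independent sketch of why premature cleanup produces this error: everything under a `TemporaryDirectory` is deleted the moment its `with` block exits, so any path captured inside the block becomes invalid afterwards.

```python
import tempfile
from pathlib import Path

# The directory created by TemporaryDirectory is removed as soon as
# the context manager exits, so paths under it stop existing.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "models"
    model_path.mkdir()
    exists_inside = model_path.exists()  # True while the context is open

exists_after = model_path.exists()  # False: cleanup already ran
print(exists_inside, exists_after)
```

Any task that holds on to such a path past the context exit will hit `FileNotFoundError` when it next touches the file.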
Set the `TMPDIR` Environment Variable: Ensure that the `TMPDIR` environment variable is set correctly, especially on macOS, so that the location of the temp directory does not change unexpectedly. You can configure this in your `cosmos/settings.py`:
```python
import os
import tempfile
from pathlib import Path

import airflow
from airflow.configuration import conf

from cosmos.constants import DEFAULT_COSMOS_CACHE_DIR_NAME, DEFAULT_OPENLINEAGE_NAMESPACE

# On macOS, users may want to set the envvar `TMPDIR` if they do not
# want the value of the temp directory to change.
DEFAULT_CACHE_DIR = Path(tempfile.gettempdir(), DEFAULT_COSMOS_CACHE_DIR_NAME)
cache_dir = Path(conf.get("cosmos", "cache_dir", fallback=DEFAULT_CACHE_DIR) or DEFAULT_CACHE_DIR)
enable_cache = conf.getboolean("cosmos", "enable_cache", fallback=True)
enable_cache_partial_parse = conf.getboolean("cosmos", "enable_cache_partial_parse", fallback=True)
enable_cache_dbt_ls = conf.getboolean("cosmos", "enable_cache_dbt_ls", fallback=True)
propagate_logs = conf.getboolean("cosmos", "propagate_logs", fallback=True)
dbt_docs_dir = conf.get("cosmos", "dbt_docs_dir", fallback=None)
dbt_docs_conn_id = conf.get("cosmos", "dbt_docs_conn_id", fallback=None)
dbt_docs_index_file_name = conf.get("cosmos", "dbt_docs_index_file_name", fallback="index.html")
enable_cache_profile = conf.getboolean("cosmos", "enable_cache_profile", fallback=True)
dbt_profile_cache_dir_name = conf.get("cosmos", "profile_cache_dir_name", fallback="profile")

try:
    LINEAGE_NAMESPACE = conf.get("openlineage", "namespace")
except airflow.exceptions.AirflowConfigException:
    LINEAGE_NAMESPACE = os.getenv("OPENLINEAGE_NAMESPACE", DEFAULT_OPENLINEAGE_NAMESPACE)
```
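As a hedged illustration (where you set the variable is an assumption about your deployment — e.g. an `ENV` line in the Astronomer Dockerfile or a deployment environment variable), pinning `TMPDIR` makes Python's `tempfile.gettempdir()`, and therefore Cosmos's `DEFAULT_CACHE_DIR`, resolve to a stable location:

```shell
# Pin TMPDIR so tempfile.gettempdir() resolves to a fixed path in
# every process that derives the Cosmos cache directory from it.
export TMPDIR=/tmp
python3 -c 'import tempfile; print(tempfile.gettempdir())'
```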
Handle the Temporary Directory in the `run_command` Function: Ensure that the `run_command` function in `cosmos/dbt/graph.py` correctly handles the temporary directory and environment variables:
```python
from __future__ import annotations

from pathlib import Path
from subprocess import PIPE, Popen

# `logger` and `CosmosLoadDbtException` are defined elsewhere in
# cosmos/dbt/graph.py.


def run_command(command: list[str], tmp_dir: Path, env_vars: dict[str, str]) -> str:
    """Run a command in a subprocess, returning the stdout."""
    logger.info("Running command: `%s`", " ".join(command))
    logger.debug("Environment variable keys: %s", env_vars.keys())
    process = Popen(
        command,
        stdout=PIPE,
        stderr=PIPE,
        cwd=tmp_dir,
        universal_newlines=True,
        env=env_vars,
    )
    stdout, stderr = process.communicate()
    returncode = process.returncode
    if 'Run "dbt deps" to install package dependencies' in stdout and command[1] == "ls":
        raise CosmosLoadDbtException(
            "Unable to run dbt ls command due to missing dbt_packages. Set RenderConfig.dbt_deps=True."
        )
    if returncode or "Error" in stdout.replace("WarnErrorOptions", ""):
        details = stderr or stdout
        raise CosmosLoadDbtException(f"Unable to run {command} due to the error:\n{details}")
    return stdout
```
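To see how a vanished working directory surfaces as exactly this error, here is a small self-contained sketch (not Cosmos code): `Popen` raises `FileNotFoundError: [Errno 2]` when its `cwd` no longer exists, which matches the intermittent failures on the dbt run tasks.

```python
import shutil
import tempfile
from subprocess import PIPE, Popen

# Simulate the temp directory being cleaned up before the dbt
# subprocess is launched.
tmp_dir = tempfile.mkdtemp()
shutil.rmtree(tmp_dir)

try:
    Popen(["echo", "dbt run"], cwd=tmp_dir, stdout=PIPE)
    result = "no error"
except FileNotFoundError as exc:
    result = f"{type(exc).__name__}: errno {exc.errno}"

print(result)
```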
By ensuring the temporary directory is properly managed, configuring the `TMPDIR` environment variable, and correctly handling the temporary directory in the `run_command` function, you can mitigate the `FileNotFoundError` issue [1][2][3].
Astronomer Cosmos Version
Other Astronomer Cosmos version (please specify below)
If "Other Astronomer Cosmos version" selected, which one?
1.4.3
dbt-core version
1.7.17
Versions of dbt adapters
dbt-bigquery==1.7.4 dbt-core==1.7.17 dbt-extractor==0.5.1 dbt-semantic-interfaces==0.4.4
LoadMode
DBT_LS
ExecutionMode
LOCAL
InvocationMode
SUBPROCESS
airflow version
apache-airflow==2.9.2+astro.1
Operating System
Debian GNU/Linux 11 (bullseye)
If you think it's a UI issue, what browsers are you seeing the problem on?
No response
Deployment
Astronomer
Deployment details
We have a main production deployment in Astro Cloud which we consider as production. We also do local development via `astro dev start`. We have continuous deployment set up through CircleCI, which deploys PRs merged to our master branch to our production deployment via `astro deploy --dags`. For authentication to our data warehouse (Google BigQuery) in production we use `GoogleCloudServiceAccountDictProfileMapping`, and for local development we use `ProfileConfig`, where our dbt profiles.yml has a hardcoded path to a service account JSON file that sits at the same path for each developer.
What happened?
We are still intermittently seeing `FileNotFoundError: [Errno 2] No such file or directory` for /tmp files every few hours across multiple DAGs, and have been ever since the inception of our Astronomer/Cosmos setup. The error appears on Cosmos-created dbt model run tasks. This affects our on-call personnel, who have to manually clear and re-run the failed tasks for the model to run successfully (the re-run usually succeeds). Some model runs must be re-run manually for the task to succeed, while others recover on their own.
Relevant log output
How to reproduce
I believe replicating our setup should reproduce it, as the issue happens quite often (once every few hours). We also have 1000+ dbt models, so an example project of that size may be needed for reproducibility. If underlying deployment infrastructure information is required, here it is:
Anything else :)?
No response
Are you willing to submit PR?
Contact Details
oma@forhims.com