neptune-ai / kedro-neptune

📌 Track & manage metadata, visualize & compare Kedro pipelines in a nice UI.
https://docs.neptune.ai/integrations-and-supported-tools/automation-pipelines/kedro
Apache License 2.0

BUG: `OSError: [Errno 63] File name too long` on versions>=0.1.4 #67

Closed lauraturnbull closed 1 year ago

lauraturnbull commented 1 year ago

Describe the bug

On kedro-neptune versions >=0.1.4 we get an OSError in before_pipeline_run() in kedro_neptune/__init__.py. With the same input, run from the same location, versions <=0.1.3 work without any issue.

Reproduction

In the kedro pipeline we actually have:

neptune:
  project: $NEPTUNE_PROJECT
  base_namespace: kedro
  enabled: false

but the pre-validation still runs.

The pipeline is being invoked like: kedro run --pipeline=backtesting --runner=ThreadRunner --params=<362 chars of params overriding defaults>

Traceback

│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/bin/kedro:8 in   │
│ <module>                                                                                         │
│                                                                                                  │
│   5 from kedro.framework.cli import main                                                         │
│   6 if __name__ == '__main__':                                                                   │
│   7 │   sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])                         │
│ ❱ 8 │   sys.exit(main())                                                                         │
│   9                                                                                              │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/kedro/framework/cli/cli.py:211 in main                                               │
│                                                                                                  │
│   208 │   """                                                                                    │
│   209 │   _init_plugins()                                                                        │
│   210 │   cli_collection = KedroCLI(project_path=Path.cwd())                                     │
│ ❱ 211 │   cli_collection()                                                                       │
│   212                                                                                            │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/click/core.py:1130 in __call__                                                       │
│                                                                                                  │
│   1127 │                                                                                         │
│   1128 │   def __call__(self, *args: t.Any, **kwargs: t.Any) -> t.Any:                           │
│   1129 │   │   """Alias for :meth:`main`."""                                                     │
│ ❱ 1130 │   │   return self.main(*args, **kwargs)                                                 │
│   1131                                                                                           │
│   1132                                                                                           │
│   1133 class Command(BaseCommand):                                                               │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/kedro/framework/cli/cli.py:139 in main                                               │
│                                                                                                  │
│   136 │   │   )                                                                                  │
│   137 │   │                                                                                      │
│   138 │   │   try:                                                                               │
│ ❱ 139 │   │   │   super().main(                                                                  │
│   140 │   │   │   │   args=args,                                                                 │
│   141 │   │   │   │   prog_name=prog_name,                                                       │
│   142 │   │   │   │   complete_var=complete_var,                                                 │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/click/core.py:1055 in main                                                           │
│                                                                                                  │
│   1052 │   │   try:                                                                              │
│   1053 │   │   │   try:                                                                          │
│   1054 │   │   │   │   with self.make_context(prog_name, args, **extra) as ctx:                  │
│ ❱ 1055 │   │   │   │   │   rv = self.invoke(ctx)                                                 │
│   1056 │   │   │   │   │   if not standalone_mode:                                               │
│   1057 │   │   │   │   │   │   return rv                                                         │
│   1058 │   │   │   │   │   # it's not safe to `ctx.exit(rv)` here!                               │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/click/core.py:1657 in invoke                                                         │
│                                                                                                  │
│   1654 │   │   │   │   super().invoke(ctx)                                                       │
│   1655 │   │   │   │   sub_ctx = cmd.make_context(cmd_name, args, parent=ctx)                    │
│   1656 │   │   │   │   with sub_ctx:                                                             │
│ ❱ 1657 │   │   │   │   │   return _process_result(sub_ctx.command.invoke(sub_ctx))               │
│   1658 │   │                                                                                     │
│   1659 │   │   # In chain mode we create the contexts step by step, but after the                │
│   1660 │   │   # base command has been invoked.  Because at that point we do not                 │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/click/core.py:1404 in invoke                                                         │
│                                                                                                  │
│   1401 │   │   │   echo(style(message, fg="red"), err=True)                                      │
│   1402 │   │                                                                                     │
│   1403 │   │   if self.callback is not None:                                                     │
│ ❱ 1404 │   │   │   return ctx.invoke(self.callback, **ctx.params)                                │
│   1405 │                                                                                         │
│   1406 │   def shell_complete(self, ctx: Context, incomplete: str) -> t.List["CompletionItem"]:  │
│   1407 │   │   """Return a list of completions for the incomplete value. Looks                   │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/click/core.py:760 in invoke                                                          │
│                                                                                                  │
│    757 │   │                                                                                     │
│    758 │   │   with augment_usage_errors(__self):                                                │
│    759 │   │   │   with ctx:                                                                     │
│ ❱  760 │   │   │   │   return __callback(*args, **kwargs)                                        │
│    761 │                                                                                         │
│    762 │   def forward(                                                                          │
│    763 │   │   __self, __cmd: "Command", *args: t.Any, **kwargs: t.Any  # noqa: B902             │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/kedro/framework/cli/project.py:352 in run                                            │
│                                                                                                  │
│   349 │   node_names = _get_values_as_tuple(node_names) if node_names else node_names            │
│   350 │                                                                                          │
│   351 │   with KedroSession.create(env=env, extra_params=params) as session:                     │
│ ❱ 352 │   │   session.run(                                                                       │
│   353 │   │   │   tags=tag,                                                                      │
│   354 │   │   │   runner=runner(is_async=is_async),                                              │
│   355 │   │   │   node_names=node_names,                                                         │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/kedro/framework/session/session.py:397 in run                                        │
│                                                                                                  │
│   394 │   │   # Run the runner                                                                   │
│   395 │   │   hook_manager = self._hook_manager                                                  │
│   396 │   │   runner = runner or SequentialRunner()                                              │
│ ❱ 397 │   │   hook_manager.hook.before_pipeline_run(  # pylint: disable=no-member                │
│   398 │   │   │   run_params=record_data, pipeline=filtered_pipeline, catalog=catalog            │
│   399 │   │   )                                                                                  │
│   400                                                                                            │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/pluggy/_hooks.py:265 in __call__                                                     │
│                                                                                                  │
│   262 │   │   else:                                                                              │
│   263 │   │   │   firstresult = False                                                            │
│   264 │   │                                                                                      │
│ ❱ 265 │   │   return self._hookexec(self.name, self.get_hookimpls(), kwargs, firstresult)        │
│   266 │                                                                                          │
│   267 │   def call_historic(self, result_callback=None, kwargs=None):                            │
│   268 │   │   """Call the hook with given ``kwargs`` for all registered plugins and              │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/pluggy/_manager.py:80 in _hookexec                                                   │
│                                                                                                  │
│    77 │   def _hookexec(self, hook_name, methods, kwargs, firstresult):                          │
│    78 │   │   # called from all hookcaller instances.                                            │
│    79 │   │   # enable_tracing will set its own wrapping function at self._inner_hookexec        │
│ ❱  80 │   │   return self._inner_hookexec(hook_name, methods, kwargs, firstresult)               │
│    81 │                                                                                          │
│    82 │   def register(self, plugin, name=None):                                                 │
│    83 │   │   """Register a plugin and return its canonical name or ``None`` if the name         │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/pluggy/_callers.py:60 in _multicall                                                  │
│                                                                                                  │
│   57 │   │   │   except StopIteration:                                                           │
│   58 │   │   │   │   pass                                                                        │
│   59 │   │                                                                                       │
│ ❱ 60 │   │   return outcome.get_result()                                                         │
│   61                                                                                             │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/pluggy/_result.py:60 in get_result                                                   │
│                                                                                                  │
│   57 │   │   │   return self._result                                                             │
│   58 │   │   else:                                                                               │
│   59 │   │   │   ex = self._excinfo                                                              │
│ ❱ 60 │   │   │   raise ex[1].with_traceback(ex[2])                                               │
│   61                                                                                             │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/pluggy/_callers.py:39 in _multicall                                                  │
│                                                                                                  │
│   36 │   │   │   │   │   except StopIteration:                                                   │
│   37 │   │   │   │   │   │   _raise_wrapfail(gen, "did not yield")                               │
│   38 │   │   │   │   else:                                                                       │
│ ❱ 39 │   │   │   │   │   res = hook_impl.function(*args)                                         │
│   40 │   │   │   │   │   if res is not None:                                                     │
│   41 │   │   │   │   │   │   results.append(res)                                                 │
│   42 │   │   │   │   │   │   if firstresult:  # halt further impl calls                          │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/kedro_neptune/__init__.py:436 in before_pipeline_run                                 │
│                                                                                                  │
│   433 │   ) -> None:                                                                             │
│   434 │   │   config = get_neptune_config(settings)                                              │
│   435 │   │                                                                                      │
│ ❱ 436 │   │   run = neptune.init(api_token=config.api_token,                                     │
│   437 │   │   │   │   │   │      project=config.project,                                         │
│   438 │   │   │   │   │   │      mode=_connection_mode(config.enabled),                          │
│   439 │   │   │   │   │   │      custom_run_id=self._run_id,                                     │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/neptune/new/internal/utils/deprecation.py:37 in inner                                │
│                                                                                                  │
│   34 │   │   │   │   " For details, see https://docs.neptune.ai/setup/neptune-client_1-0_rele    │
│   35 │   │   │   )                                                                               │
│   36 │   │   │                                                                                   │
│ ❱ 37 │   │   │   return func(*args, **kwargs)                                                    │
│   38 │   │                                                                                       │
│   39 │   │   return inner                                                                        │
│   40                                                                                             │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/neptune/new/internal/init/__init__.py:43 in init                                     │
│                                                                                                  │
│   40                                                                                             │
│   41 @deprecated(alternative="init_run")                                                         │
│   42 def init(*args, **kwargs):                                                                  │
│ ❱ 43 │   return init_run(*args, **kwargs)                                                        │
│   44                                                                                             │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/neptune/new/internal/utils/deprecation.py:60 in inner                                │
│                                                                                                  │
│   57 │   │   │   │   kwargs[required_kwarg_name] = kwargs[deprecated_kwarg_name]                 │
│   58 │   │   │   │   del kwargs[deprecated_kwarg_name]                                           │
│   59 │   │   │                                                                                   │
│ ❱ 60 │   │   │   return f(*args, **kwargs)                                                       │
│   61 │   │                                                                                       │
│   62 │   │   return inner                                                                        │
│   63                                                                                             │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/neptune/new/internal/init/run.py:268 in init_run                                     │
│                                                                                                  │
│   265 │   else:                                                                                  │
│   266 │   │   if mode == Mode.READ_ONLY:                                                         │
│   267 │   │   │   raise NeedExistingRunForReadOnlyMode()                                         │
│ ❱ 268 │   │   git_ref = get_git_info(discover_git_repo_location())                               │
│   269 │   │   if custom_run_id_exceeds_length(custom_run_id):                                    │
│   270 │   │   │   custom_run_id = None                                                           │
│   271                                                                                            │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/neptune/new/internal/utils/git.py:78 in discover_git_repo_location                   │
│                                                                                                  │
│   75                                                                                             │
│   76                                                                                             │
│   77 def discover_git_repo_location() -> Optional[str]:                                          │
│ ❱ 78 │   potential_initial_path = os.path.dirname(os.path.abspath(get_path_executed_script())    │
│   79 │   return get_git_repo_path(initial_path=potential_initial_path)                           │
│   80                                                                                             │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/neptune/vendor/lib_programname.py:83 in get_path_executed_script                     │
│                                                                                                  │
│    80 │   │   return path_candidate                                                              │
│    81 │                                                                                          │
│    82 │   # try to get it from sys_argv - does not work when loaded from uwsgi, works in eclip   │
│ ❱  83 │   path_candidate = get_fullpath_from_sys_argv()                                          │
│    84 │   if path_candidate != empty_path:                                                       │
│    85 │   │   return path_candidate                                                              │
│    86                                                                                            │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/neptune/vendor/lib_programname.py:129 in get_fullpath_from_sys_argv                  │
│                                                                                                  │
│   126 │   """                                                                                    │
│   127 │                                                                                          │
│   128 │   for arg_string in sys.argv:                                                            │
│ ❱ 129 │   │   valid_executable_path = get_valid_executable_path_or_empty_path(arg_string)        │
│   130 │   │   if valid_executable_path != empty_path:                                            │
│   131 │   │   │   return valid_executable_path                                                   │
│   132 │   return empty_path                                                                      │
│                                                                                                  │
│ /Users/laura/Library/Caches/pypoetry/virtualenvs/ds-optimisation-Vv0Dsqfe-py3.8/lib/python3.8/si │
│ te-packages/neptune/vendor/lib_programname.py:139 in get_valid_executable_path_or_empty_path     │
│                                                                                                  │
│   136 │   arg_string = remove_doctest_and_docrunner_parameters(arg_string)                       │
│   137 │   arg_string = add_python_extension_if_not_there(arg_string)                             │
│   138 │   path = pathlib.Path(arg_string)                                                        │
│ ❱ 139 │   if path.is_file():                                                                     │
│   140 │   │   path = path.resolve()  # .resolve does not work on a non existing file in python   │
│   141 │   │   return path                                                                        │
│   142 │   else:                                                                                  │
│                                                                                                  │
│ /Users/laura/.pyenv/versions/3.8.12/lib/python3.8/pathlib.py:1439 in is_file                     │
│                                                                                                  │
│   1436 │   │   to regular files).                                                                │
│   1437 │   │   """                                                                               │
│   1438 │   │   try:                                                                              │
│ ❱ 1439 │   │   │   return S_ISREG(self.stat().st_mode)                                           │
│   1440 │   │   except OSError as e:                                                              │
│   1441 │   │   │   if not _ignore_error(e):                                                      │
│   1442 │   │   │   │   raise                                                                     │
│                                                                                                  │
│ /Users/laura/.pyenv/versions/3.8.12/lib/python3.8/pathlib.py:1198 in stat                        │
│                                                                                                  │
│   1195 │   │   Return the result of the stat() system call on this path, like                    │
│   1196 │   │   os.stat() does.                                                                   │
│   1197 │   │   """                                                                               │
│ ❱ 1198 │   │   return self._accessor.stat(self)                                                  │
│   1199 │                                                                                         │
│   1200 │   def owner(self):                                                                      │
│   1201 │   │   """                                                                               │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
OSError: [Errno 63] File name too long:
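
Reading the traceback, the failure appears to come from the vendored lib_programname helper calling Path.is_file() on every entry of sys.argv. On Python 3.8, is_file() re-raises any OSError it does not treat as "file missing", and a single argument longer than the file-system name limit (typically 255 bytes) produces ENAMETOOLONG. The following is a minimal sketch of that mechanism and of a tolerant check, not the library's actual fix; `first_existing_file` and `long_arg` are hypothetical names used only for illustration.

```python
import errno
import pathlib
from typing import List, Optional


def first_existing_file(args: List[str]) -> Optional[pathlib.Path]:
    """Return the first argument that names an existing file, or None.

    Arguments that are too long to be a file name are skipped instead of
    letting the OSError propagate to the caller.
    """
    for arg in args:
        try:
            if pathlib.Path(arg).is_file():
                return pathlib.Path(arg).resolve()
        except OSError as exc:
            # ENAMETOOLONG is errno 63 on macOS and errno 36 on Linux.
            if exc.errno != errno.ENAMETOOLONG:
                raise
    return None


# Hypothetical stand-in for the long --params override from the report.
long_arg = "--params=" + "x" * 362
print(first_existing_file([long_arg]))
```

Comparing against errno.ENAMETOOLONG rather than the literal 63 keeps the check portable, since the numeric value differs between operating systems.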

Environment

Python: 3.8.12
Kedro: 0.18.2
kedro-neptune: 0.1.6
neptune-client: 0.16.18

SiddhantSadangi commented 1 year ago

Hello @lauraturnbull !

I cannot reproduce the issue. However, IMO Neptune prevalidations should not run if enabled=False. I am checking with the devs to confirm whether this is indeed a bug.

Meanwhile, could you check whether the issue still occurs after updating neptune-client to v1.2.0?

lauraturnbull commented 1 year ago

Hi @SiddhantSadangi, thanks for the quick response! Unfortunately the issue still happens with v1.2.0. Regarding the Neptune prevalidations: some Neptune behaviour has always run even with enabled: false. For example, with enabled: false Neptune still looks for the credentials_neptune.yml file and validates $NEPTUNE_API_TOKEN. I think the ideal endpoint would be ensuring that nothing neptune related runs when it isn't enabled. Let me know what the devs think :)

Thanks!

SiddhantSadangi commented 1 year ago

I think the ideal endpoint would be ensuring that nothing neptune related runs when it isn't enabled

I second this ✅
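
The behaviour proposed here can be sketched as an early return before any Neptune call is made. This is only an illustration under assumptions: `NeptuneConfig`, `start_run_if_enabled`, and the injected `init_run` callable are hypothetical and do not reflect the plugin's real code.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class NeptuneConfig:
    """Hypothetical stand-in for the settings parsed from neptune.yml."""

    project: str
    enabled: bool


def start_run_if_enabled(
    config: NeptuneConfig, init_run: Callable[..., Any]
) -> Optional[Any]:
    """Create a run only when the integration is enabled.

    When `enabled` is false the function returns before touching
    credentials, sys.argv, or the network, so nothing Neptune-related runs.
    """
    if not config.enabled:
        return None
    return init_run(project=config.project)


# Usage: a dummy initializer records whether it was called.
calls = []
disabled = NeptuneConfig(project="workspace/project", enabled=False)
start_run_if_enabled(disabled, lambda **kwargs: calls.append(kwargs))
print(calls)
```

Passing the initializer in as a parameter keeps the sketch testable without a Neptune account or API token.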

Regarding the bug, I don't see any change introduced in kedro-neptune v0.1.4 that could have caused this. Judging by the traceback, the error is raised by the os.stat() call made from pathlib, which is part of the Python standard library.

Can you confirm that nothing else changed in the environment apart from updating kedro-neptune? One way to test this would be to downgrade kedro-neptune to 0.1.3 and see whether the pipeline runs without any issues.

SiddhantSadangi commented 1 year ago

Hello @lauraturnbull, just checking whether you are still facing this issue.

SiddhantSadangi commented 1 year ago

Hello @lauraturnbull ,

kedro-neptune release 0.2.0 fixes the issue of Neptune prevalidations running even when enabled: false is set.

I am closing this issue for now, but please feel free to reopen it if you are still facing issues.