oj-m closed this issue 2 years ago
Hi @oj-m and thanks for the bug report. Is there anything else in the error message at all? It's not a very helpful message unfortunately but I suspect this is something to do with git rather than kedro itself. Is the directory you're running kedro in version controlled?
Kedro Docs contains a pandas Iris example project which has Python 3.6 in the requirements file
Please could you point out where the Python 3.6 requirement is? I'm a bit surprised to hear this. Kedro 0.17.7 should indeed work with 3.6, 3.7 and 3.8.
I would also add: does this work if you do not init the repository? That error is from git, not kedro. This Stack Overflow post may help, as Homebrew users are reporting similar issues.
Hi @oj-m, I tried to replicate the error on my M1 machine but couldn't get the error. This error might be related to some other package instead
OK, I cleaned everything up, skipped git init, and received a missing jupyter_client dependency error:
$ kedro new --starter=pandas-iris
2022-04-02 16:05:04,377 - kedro.framework.cli.hooks.manager - INFO - Registered CLI hooks from 1 installed plugin(s): kedro-telemetry-0.1.4
Kedro-Telemetry is installed, but you have opted out of sharing usage analytics so none will be collected.
Project Name:
=============
Please enter a human readable name for your new project.
Spaces and punctuation are allowed.
[New Kedro Project]:
Repository Name:
================
Please enter a directory name for your new project repository.
Alphanumeric characters, hyphens and underscores are allowed.
Lowercase is recommended.
[new-kedro-project]:
Python Package Name:
====================
Please enter a valid Python package name for your project package.
Alphanumeric characters and underscores are allowed.
Lowercase is recommended. Package name must start with a letter
or underscore.
[new_kedro_project]:
Change directory to the project generated in /Users/oj-m/Documents/new-kedro-project
A best-practice setup includes initialising git and creating a virtual environment before running ``kedro install`` to install project-specific dependencies. Refer to the Kedro documentation: https://kedro.readthedocs.io/
$ cd new-kedro-project
$ kedro install
2022-04-02 16:06:39,161 - kedro.framework.cli.hooks.manager - INFO - Registered CLI hooks from 1 installed plugin(s): kedro-telemetry-0.1.4
As an open-source project, we collect usage analytics.
We cannot see nor store information contained in a Kedro project.
You can find out more by reading our privacy notice:
https://github.com/kedro-org/kedro-plugins/tree/main/kedro-telemetry#privacy-notice
Do you opt into usage analytics? [y/N]:
Kedro-Telemetry is installed, but you have opted out of sharing usage analytics so none will be collected.
DeprecationWarning: Command `kedro install` will be deprecated in Kedro 0.18.0. In the future use `pip install -r src/requirements.txt` instead. If you were running `kedro install` with the `--build-reqs` flag, we recommend running `kedro build-reqs` followed by `pip install -r src/requirements.txt`
No requirements.in found. Copying contents from requirements.txt...
/Users/oj-m/.pyenv/versions/3.8.13/envs/kedro/bin/python3.8 -m piptools compile -q /Users/oj-m/Documents/new-kedro-project/src/requirements.in
Could not find a version that matches jupyter_client<7.0,>=4.1,>=5.1,>=5.3.4,>=6.1.12,>=7.0.0 (from -r /Users/oj-m/Documents/new-kedro-project/src/requirements.in (line 7))
Tried: 4.0.0, 4.0.0, 4.0.0, 4.1.0, 4.1.0, 4.1.1, 4.1.1, 4.1.1, 4.2.0, 4.2.0, 4.2.0, 4.2.1, 4.2.1, 4.2.1, 4.2.2, 4.2.2, 4.2.2, 4.3.0, 4.3.0, 4.3.0, 4.4.0, 4.4.0, 5.0.0, 5.0.0, 5.0.1, 5.0.1, 5.1.0, 5.1.0, 5.2.0, 5.2.0, 5.2.1, 5.2.1, 5.2.2, 5.2.2, 5.2.3, 5.2.3, 5.2.4, 5.2.4, 5.3.0, 5.3.0, 5.3.1, 5.3.1, 5.3.2, 5.3.2, 5.3.3, 5.3.3, 5.3.4, 5.3.4, 5.3.5, 5.3.5, 6.0.0, 6.0.0, 6.1.0, 6.1.0, 6.1.1, 6.1.1, 6.1.2, 6.1.2, 6.1.3, 6.1.3, 6.1.5, 6.1.5, 6.1.6, 6.1.6, 6.1.7, 6.1.7, 6.1.8, 6.1.8, 6.1.9, 6.1.9, 6.1.10, 6.1.10, 6.1.11, 6.1.11, 6.1.12, 6.1.12, 6.1.13, 6.1.13, 6.2.0, 6.2.0, 7.0.0, 7.0.0, 7.0.1, 7.0.1, 7.0.2, 7.0.2, 7.0.3, 7.0.3, 7.0.4, 7.0.4, 7.0.5, 7.0.5, 7.0.6, 7.0.6, 7.1.0, 7.1.0, 7.1.1, 7.1.1, 7.1.2, 7.1.2, 7.2.0, 7.2.0, 7.2.1, 7.2.1
Skipped pre-versions: 7.0.0a0, 7.0.0a0, 7.0.0a1, 7.0.0a1, 7.0.0rc0, 7.0.0rc0, 7.0.0rc1, 7.0.0rc1
There are incompatible versions in the resolved dependencies:
jupyter_client<7.0,>=5.1 (from -r /Users/oj-m/Documents/new-kedro-project/src/requirements.in (line 7))
jupyter-client>=5.3.4 (from notebook==6.4.10->jupyter==1.0.0->-r /Users/oj-m/Documents/new-kedro-project/src/requirements.in (line 6))
jupyter-client>=6.1.12 (from ipykernel==6.11.0->jupyter==1.0.0->-r /Users/oj-m/Documents/new-kedro-project/src/requirements.in (line 6))
jupyter-client>=7.0.0 (from jupyter-console==6.4.3->jupyter==1.0.0->-r /Users/oj-m/Documents/new-kedro-project/src/requirements.in (line 6))
jupyter-client<7.0,>=5.1 (from kedro[pandas.csvdataset]==0.17.7->-r /Users/oj-m/Documents/new-kedro-project/src/requirements.in (line 9))
jupyter-client>=6.1.12 (from jupyter-server==1.16.0->jupyterlab==3.3.2->-r /Users/oj-m/Documents/new-kedro-project/src/requirements.in (line 8))
jupyter-client>=4.1 (from qtconsole==5.3.0->jupyter==1.0.0->-r /Users/oj-m/Documents/new-kedro-project/src/requirements.in (line 6))
That is a fair error that got fixed in 0.18. As a workaround, add jupyter-console<6.4.3 to your requirements (6.4.3 requires jupyter_client>=7.0), as mentioned in 1356.
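To see why the resolver gave up, here is a minimal standard-library sketch that checks a few of the jupyter_client versions from the "Tried" list against the bounds in the log. The helper names parse_version and satisfies are hypothetical, only the `<` and `>=` operators are handled, and it assumes the only effect of the jupyter-console pin is to drop the `>=7.0.0` bound.

```python
from typing import Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Turn '6.1.12' into (6, 1, 12), padding short versions such as '7.0' with zeros."""
    parts = tuple(int(piece) for piece in text.strip().split("."))
    return parts + (0,) * (3 - len(parts))


def satisfies(version: str, specifiers: str) -> bool:
    """Check a version against comma-separated '<' and '>=' bounds (nothing else is supported)."""
    parsed = parse_version(version)
    for spec in specifiers.split(","):
        spec = spec.strip()
        if spec.startswith(">="):
            ok = parsed >= parse_version(spec[2:])
        elif spec.startswith("<") and not spec.startswith("<="):
            ok = parsed < parse_version(spec[1:])
        else:
            raise ValueError(f"unsupported specifier: {spec}")
        if not ok:
            return False
    return True


# A handful of versions taken from the "Tried" list in the log above.
candidates = ["5.3.4", "6.1.12", "6.2.0", "7.0.0", "7.2.1"]

# All bounds reported by pip-tools, including jupyter-console 6.4.3's ">=7.0.0".
all_bounds = "<7.0,>=4.1,>=5.1,>=5.3.4,>=6.1.12,>=7.0.0"
# The same bounds without the ">=7.0.0" contributed by jupyter-console 6.4.3.
without_console_bound = "<7.0,>=4.1,>=5.1,>=5.3.4,>=6.1.12"

unsatisfiable = [v for v in candidates if satisfies(v, all_bounds)]
after_pin = [v for v in candidates if satisfies(v, without_console_bound)]
print(unsatisfiable)  # no candidate is both <7.0 and >=7.0.0
print(after_pin)
```

The `<7.0` bound from kedro 0.17.7 and the `>=7.0.0` bound from jupyter-console 6.4.3 cannot both hold, which is why removing the latter lets versions in the 6.1.12 to 6.2.0 range through.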
And on your earlier error, would be great to hear if you were able to find a fix. If not which version of git are you using?
Updating the requirements.txt file with the versions as described above resolved the kedro install step, thanks.
However, the Git issues persist. My installed versions:
Executing kedro run prior to running git init results in:
2022-04-02 17:14:41,321 - kedro.framework.cli.hooks.manager - INFO - Registered CLI hooks from 1 installed plugin(s): kedro-telemetry-0.1.4
Kedro-Telemetry is installed, but you have opted out of sharing usage analytics so none will be collected.
2022-04-02 17:14:41,361 - kedro.framework.session.store - INFO - `read()` not implemented for `BaseSessionStore`. Assuming empty store.
fatal: not a git repository (or any of the parent directories): .git
2022-04-02 17:14:41,371 - kedro.framework.session.session - WARNING - Unable to git describe /Users/oj-m/Documents/pandas-iris
...
Executing kedro run after running git init results in:
2022-04-02 17:14:57,575 - kedro.framework.cli.hooks.manager - INFO - Registered CLI hooks from 1 installed plugin(s): kedro-telemetry-0.1.4
Kedro-Telemetry is installed, but you have opted out of sharing usage analytics so none will be collected.
2022-04-02 17:14:57,614 - kedro.framework.session.store - INFO - `read()` not implemented for `BaseSessionStore`. Assuming empty store.
fatal: Needed a single revision
2022-04-02 17:14:57,627 - kedro.framework.session.session - WARNING - Unable to git describe /Users/oj-m/Documents/pandas-iris
...
Well, looks like it requires a first commit, not just an init...
git add * && git commit -m "First"
kedro run
The rest seems to be working. Thanks.
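The difference between a bare git init and a repository with a first commit can be checked from Python. This is a minimal sketch, not part of Kedro: has_initial_commit is a hypothetical helper that asks git whether HEAD resolves to a commit and hides git's own stderr output.

```python
import subprocess
from pathlib import Path


def has_initial_commit(project_path: Path) -> bool:
    """Return True only if `git rev-parse --verify HEAD` succeeds in project_path."""
    try:
        subprocess.run(
            ["git", "rev-parse", "--verify", "--quiet", "HEAD"],
            cwd=str(project_path),
            check=True,  # a non-zero exit raises CalledProcessError
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,  # keep "fatal: ..." messages off the console
        )
    except (subprocess.CalledProcessError, FileNotFoundError, NotADirectoryError):
        # Not a repository, no commit yet, git not installed, or a bad path.
        return False
    return True


if __name__ == "__main__":
    print(has_initial_commit(Path.cwd()))
```

Under these assumptions the helper returns False both before git init and after git init without a commit, which matches the two failure cases shown in the logs above.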
There is one note at the end, however, that doesn't match the expectations from the docs:
...
2022-04-02 17:34:32,058 - kedro.runner.sequential_runner - INFO - Completed 4 out of 4 tasks
2022-04-02 17:34:32,058 - kedro.runner.sequential_runner - INFO - Pipeline execution completed successfully.
2022-04-02 17:34:32,058 - kedro.framework.session.store - INFO - `save()` not implemented for `BaseSessionStore`. Skipping the step.
Assuming it's benign?
Yes, that is only an info log. So no trouble with that.
Good that you have a workaround for the git error. I'm able to run the pipeline without any error, so I can't replicate the error to help :(
Some users are still experiencing issues; still investigating.
Adding more information
import subprocess
import logging
from typing import Any, Dict, Iterable, Union
from pathlib import Path


def _describe_git(project_path: Path) -> Dict[str, Dict[str, Any]]:
    project_path = str(project_path)
    try:
        res = subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"], cwd=project_path
        )
    # `subprocess.check_output()` raises `NotADirectoryError` on Windows
    except (subprocess.CalledProcessError, FileNotFoundError, NotADirectoryError):
        logging.getLogger(__name__).warning("Unable to git describe %s", project_path)
        return {}

    git_data = {"commit_sha": res.decode().strip()}  # type: Dict[str, Any]

    res = subprocess.check_output(["git", "status", "--short"], cwd=project_path)
    git_data["dirty"] = bool(res.decode().strip())
    return {"git": git_data}


project_path = Path.cwd()
_describe_git(project_path)
Okay, so I understand this better now: the exception handler does successfully catch the error on the Python side, but the subprocess will still cause the quite scary error message to be presented:
We could pre-check this by doing a couple of things:
- Check for .git in the folder.
- In subprocess.check_output, we could redirect stderr to stdout, as shown in this example.

@datajoely That's what initially gave me pause, as I didn't catch that the fatal error was actually a benign INFO log. Without diving into the codebase, it just wasn't immediately clear that the error was bubbling up messaging from an expected git state.
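The idea of keeping git's stderr off the console can be sketched as follows. This is an illustration under stated assumptions and not necessarily what the eventual fix does: describe_git_quietly is a hypothetical variant of the function quoted earlier, and it discards stderr with subprocess.DEVNULL rather than folding it into stdout, so that stray git warnings cannot end up in the parsed commit hash.

```python
import logging
import subprocess
from pathlib import Path
from typing import Any, Dict


def describe_git_quietly(project_path: Path) -> Dict[str, Dict[str, Any]]:
    """Collect the commit hash and dirty flag without letting git print to the terminal."""
    path = str(project_path)
    try:
        res = subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"],
            cwd=path,
            stderr=subprocess.DEVNULL,  # swallow "fatal: ..." instead of showing it
        )
    except (subprocess.CalledProcessError, FileNotFoundError, NotADirectoryError):
        # Same handling as the original snippet: log once and carry on.
        logging.getLogger(__name__).warning("Unable to git describe %s", path)
        return {}

    git_data: Dict[str, Any] = {"commit_sha": res.decode().strip()}
    res = subprocess.check_output(
        ["git", "status", "--short"], cwd=path, stderr=subprocess.DEVNULL
    )
    git_data["dirty"] = bool(res.decode().strip())
    return {"git": git_data}


if __name__ == "__main__":
    print(describe_git_quietly(Path.cwd()))
```

With this variant, a directory without a commit would still produce the single "Unable to git describe" warning from the logger, but not the raw fatal line from git itself.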
Just to understand where we stand on this... This isn't actually related to Apple M1 chips at all, right? It's just what happens if you do kedro run in a directory which hasn't had a git commit yet?
It's not - I've changed the description
Fix in review https://github.com/kedro-org/kedro/pull/1422/files
Description
Kedro Docs contains a pandas Iris example project which has Python 3.6 in the requirements file, which does not execute on newer Apple M1 chipsets. Attempting to execute it on an M1-compatible version of Python 3.8.13 via kedro run results in:

Context
Attempt the docs instructions on https://kedro.readthedocs.io/en/stable/get_started/example_project.html on an Apple M1.

Steps to Reproduce
Execute the following on an Apple M1:

Expected Result
Run without fatal errors.

Actual Result

Your Environment
Include as many relevant details about the environment in which you experienced the bug:
- Kedro version: 0.17.7
- Python version (python -V): 3.8.13
- Operating system: macOS 12.3