kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0

Sagemaker notebooks raise error for `pandas.CSVDataSet` #308

Closed tjcuddihy closed 4 years ago

tjcuddihy commented 4 years ago

Description

The conda environment for Python 3.6 used by SageMaker notebooks cannot find pandas.CSVDataSet.

Context

I want to use SageMaker as my development environment. However, I cannot get Kedro to run as expected in both the notebooks (for exploration and node development) and the terminal (for running pipelines): it works in the terminal but fails in the notebooks.

Steps to Reproduce

  1. Start up a SageMaker instance with the default settings

Terminal success:

  1. pip install kedro in the terminal
  2. kedro new
     2a. testing for name
     2b. y for example project
  3. cd testing; kedro run => Success!

Notebook fail:

  1. Create a new conda_python3 notebook in testing/notebooks/
  2. !pip install kedro in a notebook

    The environments for the terminal and the notebooks are separate by design in SageMaker.

  3. Load the kedro context as described here

    Note that I've started to use the code below. Without the check that current_dir already exists, you need to restart the kernel to reload the context, because something in the last two lines of code causes the next call to Path.cwd() to return the project root rather than notebooks/ as intended.

    from pathlib import Path
    from kedro.context import load_context

    if "current_dir" not in locals():
        # Set this only once. For some reason this is not an idempotent operation?
        current_dir = Path.cwd()  # this points to 'notebooks/' folder
    proj_path = current_dir.parent  # point back to the root of the project
    context = load_context(proj_path)
  4. Run context.catalog.list()
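
The working-directory dependence in step 3 could probably be avoided by locating the project root from a marker file instead of relying on Path.cwd(). Below is a minimal sketch, assuming the project root contains a `.kedro.yml` file (which I believe is the case for projects generated by Kedro 0.15.x; change the marker if not). `find_project_root` is a hypothetical helper, not part of Kedro.

```python
from pathlib import Path


def find_project_root(start: Path, marker: str = ".kedro.yml") -> Path:
    """Return the nearest directory at or above `start` that contains `marker`."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(f"No {marker!r} found at or above {start}")


# Usage in a notebook (commented out because it needs a Kedro project):
# from kedro.context import load_context
# context = load_context(find_project_root(Path.cwd()))
```

Because the search walks upward, the result should be the same whether the current directory is notebooks/ or the project root, so the cell can be re-run without restarting the kernel.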

Expected Result

The notebook should print:

['example_iris_data',
 'parameters',
 'params:example_test_data_ratio',
 'params:example_num_train_iter',
 'params:example_learning_rate']

Actual Result

Class `pandas.CSVDataSet` not found.

Full trace.

---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
~/anaconda3/envs/python3/lib/python3.6/site-packages/kedro/io/core.py in parse_dataset_definition(config, load_version, save_version)
    416         try:
--> 417             class_obj = next(obj for obj in trials if obj is not None)
    418         except StopIteration:

StopIteration: 

During handling of the above exception, another exception occurred:

DataSetError                              Traceback (most recent call last)
~/anaconda3/envs/python3/lib/python3.6/site-packages/kedro/io/core.py in from_config(cls, name, config, load_version, save_version)
    148             class_obj, config = parse_dataset_definition(
--> 149                 config, load_version, save_version
    150             )

~/anaconda3/envs/python3/lib/python3.6/site-packages/kedro/io/core.py in parse_dataset_definition(config, load_version, save_version)
    418         except StopIteration:
--> 419             raise DataSetError("Class `{}` not found.".format(class_obj))
    420 

DataSetError: Class `pandas.CSVDataSet` not found.

During handling of the above exception, another exception occurred:

DataSetError                              Traceback (most recent call last)
<ipython-input-4-5848382c8bb9> in <module>()
----> 1 context.catalog.list()

~/anaconda3/envs/python3/lib/python3.6/site-packages/kedro/context/context.py in catalog(self)
    206 
    207         """
--> 208         return self._get_catalog()
    209 
    210     @property

~/anaconda3/envs/python3/lib/python3.6/site-packages/kedro/context/context.py in _get_catalog(self, save_version, journal, load_versions)
    243         conf_creds = self._get_config_credentials()
    244         catalog = self._create_catalog(
--> 245             conf_catalog, conf_creds, save_version, journal, load_versions
    246         )
    247         catalog.add_feed_dict(self._get_feed_dict())

~/anaconda3/envs/python3/lib/python3.6/site-packages/kedro/context/context.py in _create_catalog(self, conf_catalog, conf_creds, save_version, journal, load_versions)
    267             save_version=save_version,
    268             journal=journal,
--> 269             load_versions=load_versions,
    270         )
    271 

~/anaconda3/envs/python3/lib/python3.6/site-packages/kedro/io/data_catalog.py in from_config(cls, catalog, credentials, load_versions, save_version, journal)
    298             ds_config = _resolve_credentials(ds_config, credentials)
    299             data_sets[ds_name] = AbstractDataSet.from_config(
--> 300                 ds_name, ds_config, load_versions.get(ds_name), save_version
    301             )
    302         return cls(data_sets=data_sets, journal=journal)

~/anaconda3/envs/python3/lib/python3.6/site-packages/kedro/io/core.py in from_config(cls, name, config, load_version, save_version)
    152             raise DataSetError(
    153                 "An exception occurred when parsing config "
--> 154                 "for DataSet `{}`:\n{}".format(name, str(ex))
    155             )
    156 

DataSetError: An exception occurred when parsing config for DataSet `example_iris_data`:
Class `pandas.CSVDataSet` not found.
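
The final message only says that the class was not found; it does not show why the lookup failed. One way to surface the underlying error is to attempt the import directly in the notebook kernel. This is a sketch only: `try_import` is a hypothetical helper, and the dotted path `kedro.extras.datasets.pandas.CSVDataSet` is my assumption about where Kedro 0.15.8 resolves `pandas.CSVDataSet`.

```python
import importlib
from typing import Any, Optional, Tuple


def try_import(dotted_path: str) -> Tuple[Optional[Any], Optional[BaseException]]:
    """Import `package.module.Name` and return (object, None) or (None, error)."""
    module_path, _, attr_name = dotted_path.rpartition(".")
    try:
        module = importlib.import_module(module_path)
        return getattr(module, attr_name), None
    except (ImportError, AttributeError, ValueError) as error:
        return None, error


# Assumed location of the dataset class; prints the real failure, if any.
obj, error = try_import("kedro.extras.datasets.pandas.CSVDataSet")
print(obj if error is None else f"{type(error).__name__}: {error}")
```

If the printed error names a missing or incompatible dependency rather than Kedro itself, that would point at the notebook environment's packages rather than the catalog configuration.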

Investigations so far

CSVLocalDataSet

Changing the dataset type for iris.csv in the catalog YAML from pandas.CSVDataSet to CSVLocalDataSet makes both the terminal and the notebook succeed. However, this is not my desired outcome: pandas.CSVDataSet makes it easier, for me at least, to use both S3 and local datasets.

pip install kedro output from notebook

Collecting kedro
  Downloading https://files.pythonhosted.org/packages/67/6f/4faaa0e58728a318aeabc490271a636f87f6b9165245ce1d3adc764240cf/kedro-0.15.8-py3-none-any.whl (12.5MB)
    100% |████████████████████████████████| 12.5MB 4.1MB/s eta 0:00:01
Requirement already satisfied: xlsxwriter<2.0,>=1.0.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from kedro) (1.0.4)
Collecting azure-storage-file<2.0,>=1.1.0 (from kedro)
  Downloading https://files.pythonhosted.org/packages/c9/33/6c611563412ffc409b2413ac50e3a063133ea235b86c137759774c77f3ad/azure_storage_file-1.4.0-py2.py3-none-any.whl
Collecting fsspec<1.0,>=0.5.1 (from kedro)
  Downloading https://files.pythonhosted.org/packages/6e/2b/63420d49d5e5f885451429e9e0f40ad1787eed0d32b1aedd6b10f9c2719a/fsspec-0.7.1-py3-none-any.whl (66kB)
    100% |████████████████████████████████| 71kB 33.5MB/s ta 0:00:01
Requirement already satisfied: pandas<1.0,>=0.24.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from kedro) (0.24.2)
Collecting s3fs<1.0,>=0.3.0 (from kedro)
  Downloading https://files.pythonhosted.org/packages/b8/e4/b8fc59248399d2482b39340ec9be4bb2493846ac23641b43115a7e5cd675/s3fs-0.4.2-py3-none-any.whl
Requirement already satisfied: PyYAML<6.0,>=4.2 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from kedro) (5.3.1)
Collecting tables<3.6,>=3.4.4 (from kedro)
  Downloading https://files.pythonhosted.org/packages/87/f7/bb0ec32a3f3dd74143a3108fbf737e6dcfd47f0ffd61b52af7106ab7a38a/tables-3.5.2-cp36-cp36m-manylinux1_x86_64.whl (4.3MB)
    100% |████████████████████████████████| 4.3MB 10.2MB/s ta 0:00:01
Requirement already satisfied: requests<3.0,>=2.20.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from kedro) (2.20.0)
Collecting toposort<2.0,>=1.5 (from kedro)
  Downloading https://files.pythonhosted.org/packages/e9/8a/321cd8ea5f4a22a06e3ba30ef31ec33bea11a3443eeb1d89807640ee6ed4/toposort-1.5-py2.py3-none-any.whl
Requirement already satisfied: click<8.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from kedro) (6.7)
Collecting azure-storage-queue<2.0,>=1.1.0 (from kedro)
  Downloading https://files.pythonhosted.org/packages/72/94/4db044f1c155b40c5ebc037bfd9d1c24562845692c06798fbe869fe160e6/azure_storage_queue-1.4.0-py2.py3-none-any.whl
Collecting cookiecutter<2.0,>=1.6.0 (from kedro)
  Downloading https://files.pythonhosted.org/packages/86/c9/7184edfb0e89abedc37211743d1420810f6b49ae4fa695dfc443c273470d/cookiecutter-1.7.0-py2.py3-none-any.whl (40kB)
    100% |████████████████████████████████| 40kB 24.6MB/s ta 0:00:01
Collecting pandas-gbq<1.0,>=0.12.0 (from kedro)
  Downloading https://files.pythonhosted.org/packages/c3/74/126408f6bdb7b2cb1dcb8c6e4bd69a511a7f85792d686d1237d9825e6194/pandas_gbq-0.13.1-py3-none-any.whl
Collecting pip-tools<5.0.0,>=4.0.0 (from kedro)
  Downloading https://files.pythonhosted.org/packages/94/8f/59495d651f3ced9b06b69545756a27296861a6edd6c5709fbe1265ed9032/pip_tools-4.5.1-py2.py3-none-any.whl (41kB)
    100% |████████████████████████████████| 51kB 27.5MB/s ta 0:00:01
Collecting azure-storage-blob<2.0,>=1.1.0 (from kedro)
  Downloading https://files.pythonhosted.org/packages/25/f4/a307ed89014e9abb5c5cfc8ca7f8f797d12f619f17a6059a6fd4b153b5d0/azure_storage_blob-1.5.0-py2.py3-none-any.whl (75kB)
    100% |████████████████████████████████| 81kB 35.2MB/s ta 0:00:01
Collecting pyarrow<1.0.0,>=0.12.0 (from kedro)
  Downloading https://files.pythonhosted.org/packages/ba/10/93fad5849418eade4a4cd581f8cd27be1bbe51e18968ba1492140c887f3f/pyarrow-0.16.0-cp36-cp36m-manylinux1_x86_64.whl (62.9MB)
    100% |████████████████████████████████| 62.9MB 779kB/s eta 0:00:01
Requirement already satisfied: SQLAlchemy<2.0,>=1.2.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from kedro) (1.2.11)
Requirement already satisfied: xlrd<2.0,>=1.0.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from kedro) (1.1.0)
Collecting python-json-logger<1.0,>=0.1.9 (from kedro)
  Downloading https://files.pythonhosted.org/packages/80/9d/1c3393a6067716e04e6fcef95104c8426d262b4adaf18d7aa2470eab028d/python-json-logger-0.1.11.tar.gz
Collecting anyconfig<1.0,>=0.9.7 (from kedro)
  Downloading https://files.pythonhosted.org/packages/4c/00/cc525eb0240b6ef196b98300d505114339bbb7ddd68e3155483f1eb32050/anyconfig-0.9.10.tar.gz (103kB)
    100% |████████████████████████████████| 112kB 34.4MB/s ta 0:00:01
Collecting azure-storage-common~=1.4 (from azure-storage-file<2.0,>=1.1.0->kedro)
  Downloading https://files.pythonhosted.org/packages/05/6c/b2285bf3687768dbf61b6bc085b0c1be2893b6e2757a9d023263764177f3/azure_storage_common-1.4.2-py2.py3-none-any.whl (47kB)
    100% |████████████████████████████████| 51kB 25.9MB/s ta 0:00:01
Collecting azure-common>=1.1.5 (from azure-storage-file<2.0,>=1.1.0->kedro)
  Downloading https://files.pythonhosted.org/packages/e5/4d/d000fc3c5af601d00d55750b71da5c231fcb128f42ac95b208ed1091c2c1/azure_common-1.1.25-py2.py3-none-any.whl
Requirement already satisfied: python-dateutil>=2.5.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from pandas<1.0,>=0.24.0->kedro) (2.7.3)
Requirement already satisfied: numpy>=1.12.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from pandas<1.0,>=0.24.0->kedro) (1.14.3)
Requirement already satisfied: pytz>=2011k in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from pandas<1.0,>=0.24.0->kedro) (2018.4)
Requirement already satisfied: botocore>=1.12.91 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from s3fs<1.0,>=0.3.0->kedro) (1.15.27)
Requirement already satisfied: mock>=2.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from tables<3.6,>=3.4.4->kedro) (4.0.1)
Requirement already satisfied: numexpr>=2.6.2 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from tables<3.6,>=3.4.4->kedro) (2.6.5)
Requirement already satisfied: six>=1.9.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from tables<3.6,>=3.4.4->kedro) (1.11.0)
Requirement already satisfied: certifi>=2017.4.17 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from requests<3.0,>=2.20.0->kedro) (2019.11.28)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from requests<3.0,>=2.20.0->kedro) (3.0.4)
Requirement already satisfied: urllib3<1.25,>=1.21.1 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from requests<3.0,>=2.20.0->kedro) (1.23)
Requirement already satisfied: idna<2.8,>=2.5 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from requests<3.0,>=2.20.0->kedro) (2.6)
Collecting whichcraft>=0.4.0 (from cookiecutter<2.0,>=1.6.0->kedro)
  Downloading https://files.pythonhosted.org/packages/b5/a2/81887a0dae2e4d2adc70d9a3557fdda969f863ced51cd3c47b587d25bce5/whichcraft-0.6.1-py2.py3-none-any.whl
Collecting future>=0.15.2 (from cookiecutter<2.0,>=1.6.0->kedro)
  Downloading https://files.pythonhosted.org/packages/45/0b/38b06fd9b92dc2b68d58b75f900e97884c45bedd2ff83203d933cf5851c9/future-0.18.2.tar.gz (829kB)
    100% |████████████████████████████████| 829kB 27.8MB/s ta 0:00:01
Collecting poyo>=0.1.0 (from cookiecutter<2.0,>=1.6.0->kedro)
  Downloading https://files.pythonhosted.org/packages/42/50/0b0820601bde2eda403f47b9a4a1f270098ed0dd4c00c443d883164bdccc/poyo-0.5.0-py2.py3-none-any.whl
Collecting binaryornot>=0.2.0 (from cookiecutter<2.0,>=1.6.0->kedro)
  Downloading https://files.pythonhosted.org/packages/24/7e/f7b6f453e6481d1e233540262ccbfcf89adcd43606f44a028d7f5fae5eb2/binaryornot-0.4.4-py2.py3-none-any.whl
Collecting jinja2-time>=0.1.0 (from cookiecutter<2.0,>=1.6.0->kedro)
  Downloading https://files.pythonhosted.org/packages/6a/a1/d44fa38306ffa34a7e1af09632b158e13ec89670ce491f8a15af3ebcb4e4/jinja2_time-0.2.0-py2.py3-none-any.whl
Requirement already satisfied: jinja2>=2.7 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from cookiecutter<2.0,>=1.6.0->kedro) (2.10)
Collecting google-auth-oauthlib (from pandas-gbq<1.0,>=0.12.0->kedro)
  Downloading https://files.pythonhosted.org/packages/7b/b8/88def36e74bee9fce511c9519571f4e485e890093ab7442284f4ffaef60b/google_auth_oauthlib-0.4.1-py2.py3-none-any.whl
Collecting google-auth (from pandas-gbq<1.0,>=0.12.0->kedro)
  Downloading https://files.pythonhosted.org/packages/05/b0/cc391ebf8ebf7855cdcfe0a9a4cdc8dcd90287c90e1ac22651d104ac6481/google_auth-1.12.0-py2.py3-none-any.whl (83kB)
    100% |████████████████████████████████| 92kB 35.5MB/s ta 0:00:01
Collecting pydata-google-auth (from pandas-gbq<1.0,>=0.12.0->kedro)
  Downloading https://files.pythonhosted.org/packages/87/ed/9c9f410c032645632de787b8c285a78496bd89590c777385b921eb89433d/pydata_google_auth-0.3.0-py2.py3-none-any.whl
Requirement already satisfied: setuptools in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from pandas-gbq<1.0,>=0.12.0->kedro) (39.1.0)
Collecting google-cloud-bigquery>=1.11.1 (from pandas-gbq<1.0,>=0.12.0->kedro)
  Downloading https://files.pythonhosted.org/packages/8f/f7/b6f55e144da37f38a79552a06103f2df4a9569e2dfc6d741a7e2a63d3592/google_cloud_bigquery-1.24.0-py2.py3-none-any.whl (165kB)
    100% |████████████████████████████████| 174kB 39.2MB/s ta 0:00:01
Requirement already satisfied: cryptography in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from azure-storage-common~=1.4->azure-storage-file<2.0,>=1.1.0->kedro) (2.8)
Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from botocore>=1.12.91->s3fs<1.0,>=0.3.0->kedro) (0.9.4)
Requirement already satisfied: docutils<0.16,>=0.10 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from botocore>=1.12.91->s3fs<1.0,>=0.3.0->kedro) (0.14)
Collecting arrow (from jinja2-time>=0.1.0->cookiecutter<2.0,>=1.6.0->kedro)
  Downloading https://files.pythonhosted.org/packages/92/fa/f84896dede5decf284e6922134bf03fd26c90870bbf8015f4e8ee2a07bcc/arrow-0.15.5-py2.py3-none-any.whl (46kB)
    100% |████████████████████████████████| 51kB 26.3MB/s ta 0:00:01
Requirement already satisfied: MarkupSafe>=0.23 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from jinja2>=2.7->cookiecutter<2.0,>=1.6.0->kedro) (1.0)
Collecting requests-oauthlib>=0.7.0 (from google-auth-oauthlib->pandas-gbq<1.0,>=0.12.0->kedro)
  Downloading https://files.pythonhosted.org/packages/a3/12/b92740d845ab62ea4edf04d2f4164d82532b5a0b03836d4d4e71c6f3d379/requests_oauthlib-1.3.0-py2.py3-none-any.whl
Collecting pyasn1-modules>=0.2.1 (from google-auth->pandas-gbq<1.0,>=0.12.0->kedro)
  Downloading https://files.pythonhosted.org/packages/95/de/214830a981892a3e286c3794f41ae67a4495df1108c3da8a9f62159b9a9d/pyasn1_modules-0.2.8-py2.py3-none-any.whl (155kB)
    100% |████████████████████████████████| 163kB 32.5MB/s ta 0:00:01
Requirement already satisfied: rsa<4.1,>=3.1.4 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from google-auth->pandas-gbq<1.0,>=0.12.0->kedro) (3.4.2)
Collecting cachetools<5.0,>=2.0.0 (from google-auth->pandas-gbq<1.0,>=0.12.0->kedro)
  Downloading https://files.pythonhosted.org/packages/08/6a/abf83cb951617793fd49c98cb9456860f5df66ff89883c8660aa0672d425/cachetools-4.0.0-py3-none-any.whl
Collecting google-api-core<2.0dev,>=1.15.0 (from google-cloud-bigquery>=1.11.1->pandas-gbq<1.0,>=0.12.0->kedro)
  Downloading https://files.pythonhosted.org/packages/63/7e/a523169b0cc9ce62d56e07571db927286a94b1a5f51ac220bd97db825c77/google_api_core-1.16.0-py2.py3-none-any.whl (70kB)
    100% |████████████████████████████████| 71kB 29.9MB/s ta 0:00:01
Collecting google-cloud-core<2.0dev,>=1.1.0 (from google-cloud-bigquery>=1.11.1->pandas-gbq<1.0,>=0.12.0->kedro)
  Downloading https://files.pythonhosted.org/packages/89/3c/8a7531839028c9690e6d14c650521f3bbaf26e53baaeb2784b8c3eb2fb97/google_cloud_core-1.3.0-py2.py3-none-any.whl
Requirement already satisfied: protobuf>=3.6.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from google-cloud-bigquery>=1.11.1->pandas-gbq<1.0,>=0.12.0->kedro) (3.6.1)
Collecting google-resumable-media<0.6dev,>=0.5.0 (from google-cloud-bigquery>=1.11.1->pandas-gbq<1.0,>=0.12.0->kedro)
  Downloading https://files.pythonhosted.org/packages/35/9e/f73325d0466ce5bdc36333f1aeb2892ead7b76e79bdb5c8b0493961fa098/google_resumable_media-0.5.0-py2.py3-none-any.whl
Requirement already satisfied: cffi!=1.11.3,>=1.8 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from cryptography->azure-storage-common~=1.4->azure-storage-file<2.0,>=1.1.0->kedro) (1.11.5)
Collecting oauthlib>=3.0.0 (from requests-oauthlib>=0.7.0->google-auth-oauthlib->pandas-gbq<1.0,>=0.12.0->kedro)
  Downloading https://files.pythonhosted.org/packages/05/57/ce2e7a8fa7c0afb54a0581b14a65b56e62b5759dbc98e80627142b8a3704/oauthlib-3.1.0-py2.py3-none-any.whl (147kB)
    100% |████████████████████████████████| 153kB 42.0MB/s ta 0:00:01
Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from pyasn1-modules>=0.2.1->google-auth->pandas-gbq<1.0,>=0.12.0->kedro) (0.4.8)
Collecting googleapis-common-protos<2.0dev,>=1.6.0 (from google-api-core<2.0dev,>=1.15.0->google-cloud-bigquery>=1.11.1->pandas-gbq<1.0,>=0.12.0->kedro)
  Downloading https://files.pythonhosted.org/packages/05/46/168fd780f594a4d61122f7f3dc0561686084319ad73b4febbf02ae8b32cf/googleapis-common-protos-1.51.0.tar.gz
Requirement already satisfied: pycparser in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from cffi!=1.11.3,>=1.8->cryptography->azure-storage-common~=1.4->azure-storage-file<2.0,>=1.1.0->kedro) (2.18)
Building wheels for collected packages: python-json-logger, anyconfig, future, googleapis-common-protos
  Running setup.py bdist_wheel for python-json-logger ... done
  Stored in directory: /home/ec2-user/.cache/pip/wheels/97/f7/a1/752e22bb30c1cfe38194ea0070a5c66e76ef4d06ad0c7dc401
  Running setup.py bdist_wheel for anyconfig ... done
  Stored in directory: /home/ec2-user/.cache/pip/wheels/5a/82/0d/e374b7c77f4e4aa846a9bc2057e1d108c7f8e6b97a383befc9
  Running setup.py bdist_wheel for future ... done
  Stored in directory: /home/ec2-user/.cache/pip/wheels/8b/99/a0/81daf51dcd359a9377b110a8a886b3895921802d2fc1b2397e
  Running setup.py bdist_wheel for googleapis-common-protos ... done
  Stored in directory: /home/ec2-user/.cache/pip/wheels/2c/f9/7f/6eb87e636072bf467e25348bbeb96849333e6a080dca78f706
Successfully built python-json-logger anyconfig future googleapis-common-protos
cookiecutter 1.7.0 has requirement click>=7.0, but you'll have click 6.7 which is incompatible.
google-auth 1.12.0 has requirement setuptools>=40.3.0, but you'll have setuptools 39.1.0 which is incompatible.
google-cloud-bigquery 1.24.0 has requirement six<2.0.0dev,>=1.13.0, but you'll have six 1.11.0 which is incompatible.
pip-tools 4.5.1 has requirement click>=7, but you'll have click 6.7 which is incompatible.
Installing collected packages: azure-common, azure-storage-common, azure-storage-file, fsspec, s3fs, tables, toposort, azure-storage-queue, whichcraft, future, poyo, binaryornot, arrow, jinja2-time, cookiecutter, pyasn1-modules, cachetools, google-auth, oauthlib, requests-oauthlib, google-auth-oauthlib, pydata-google-auth, googleapis-common-protos, google-api-core, google-cloud-core, google-resumable-media, google-cloud-bigquery, pandas-gbq, pip-tools, azure-storage-blob, pyarrow, python-json-logger, anyconfig, kedro
  Found existing installation: s3fs 0.1.5
    Uninstalling s3fs-0.1.5:
      Successfully uninstalled s3fs-0.1.5
  Found existing installation: tables 3.4.3
    Uninstalling tables-3.4.3:
      Successfully uninstalled tables-3.4.3
Successfully installed anyconfig-0.9.10 arrow-0.15.5 azure-common-1.1.25 azure-storage-blob-1.5.0 azure-storage-common-1.4.2 azure-storage-file-1.4.0 azure-storage-queue-1.4.0 binaryornot-0.4.4 cachetools-4.0.0 cookiecutter-1.7.0 fsspec-0.7.1 future-0.18.2 google-api-core-1.16.0 google-auth-1.12.0 google-auth-oauthlib-0.4.1 google-cloud-bigquery-1.24.0 google-cloud-core-1.3.0 google-resumable-media-0.5.0 googleapis-common-protos-1.51.0 jinja2-time-0.2.0 kedro-0.15.8 oauthlib-3.1.0 pandas-gbq-0.13.1 pip-tools-4.5.1 poyo-0.5.0 pyarrow-0.16.0 pyasn1-modules-0.2.8 pydata-google-auth-0.3.0 python-json-logger-0.1.11 requests-oauthlib-1.3.0 s3fs-0.4.2 tables-3.5.2 toposort-1.5 whichcraft-0.6.1
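
The "Requirement already satisfied" paths in the log above point at envs/python3. Whether the running kernel uses that same interpreter can be checked from inside the notebook. A small sketch using only the standard library; `pip_install_command` is a hypothetical helper.

```python
import sys
from typing import List


def pip_install_command(package: str) -> List[str]:
    """Build a pip command bound to the interpreter running this kernel."""
    return [sys.executable, "-m", "pip", "install", package]


print("kernel interpreter:", sys.executable)
print("command:", " ".join(pip_install_command("kedro")))

# Uncomment to actually install into the kernel's own environment:
# import subprocess
# subprocess.check_call(pip_install_command("kedro"))
```

Running pip through `sys.executable -m pip` avoids any doubt about which environment a bare `pip` on the PATH belongs to.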

pip install kedro output from terminal

Collecting kedro
  Using cached kedro-0.15.8-py3-none-any.whl (12.5 MB)
Collecting pandas<1.0,>=0.24.0
  Downloading pandas-0.25.3-cp36-cp36m-manylinux1_x86_64.whl (10.4 MB)
     |████████████████████████████████| 10.4 MB 9.6 MB/s 
Collecting azure-storage-file<2.0,>=1.1.0
  Using cached azure_storage_file-1.4.0-py2.py3-none-any.whl (30 kB)
Collecting click<8.0
  Downloading click-7.1.1-py2.py3-none-any.whl (82 kB)
     |████████████████████████████████| 82 kB 1.7 MB/s 
Collecting cookiecutter<2.0,>=1.6.0
  Using cached cookiecutter-1.7.0-py2.py3-none-any.whl (40 kB)
Collecting SQLAlchemy<2.0,>=1.2.0
  Downloading SQLAlchemy-1.3.15.tar.gz (6.1 MB)
     |████████████████████████████████| 6.1 MB 49.2 MB/s 
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Collecting tables<3.6,>=3.4.4
  Using cached tables-3.5.2-cp36-cp36m-manylinux1_x86_64.whl (4.3 MB)
Processing /home/ec2-user/.cache/pip/wheels/97/f7/a1/752e22bb30c1cfe38194ea0070a5c66e76ef4d06ad0c7dc401/python_json_logger-0.1.11-py2.py3-none-any.whl
Collecting azure-storage-blob<2.0,>=1.1.0
  Using cached azure_storage_blob-1.5.0-py2.py3-none-any.whl (75 kB)
Collecting pandas-gbq<1.0,>=0.12.0
  Using cached pandas_gbq-0.13.1-py3-none-any.whl (23 kB)
Requirement already satisfied: fsspec<1.0,>=0.5.1 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from kedro) (0.6.3)
Collecting xlsxwriter<2.0,>=1.0.0
  Downloading XlsxWriter-1.2.8-py2.py3-none-any.whl (141 kB)
     |████████████████████████████████| 141 kB 65.9 MB/s 
Collecting pip-tools<5.0.0,>=4.0.0
  Using cached pip_tools-4.5.1-py2.py3-none-any.whl (41 kB)
Collecting pyarrow<1.0.0,>=0.12.0
  Downloading pyarrow-0.16.0-cp36-cp36m-manylinux2014_x86_64.whl (63.1 MB)
     |████████████████████████████████| 63.1 MB 25 kB/s 
Collecting xlrd<2.0,>=1.0.0
  Downloading xlrd-1.2.0-py2.py3-none-any.whl (103 kB)
     |████████████████████████████████| 103 kB 66.5 MB/s 
Requirement already satisfied: s3fs<1.0,>=0.3.0 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from kedro) (0.4.0)
Collecting azure-storage-queue<2.0,>=1.1.0
  Using cached azure_storage_queue-1.4.0-py2.py3-none-any.whl (23 kB)
Processing /home/ec2-user/.cache/pip/wheels/5a/82/0d/e374b7c77f4e4aa846a9bc2057e1d108c7f8e6b97a383befc9/anyconfig-0.9.10-py2.py3-none-any.whl
Collecting toposort<2.0,>=1.5
  Using cached toposort-1.5-py2.py3-none-any.whl (7.6 kB)
Requirement already satisfied: PyYAML<6.0,>=4.2 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from kedro) (5.3.1)
Requirement already satisfied: requests<3.0,>=2.20.0 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from kedro) (2.23.0)
Requirement already satisfied: pytz>=2017.2 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from pandas<1.0,>=0.24.0->kedro) (2019.3)
Requirement already satisfied: numpy>=1.13.3 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from pandas<1.0,>=0.24.0->kedro) (1.18.1)
Requirement already satisfied: python-dateutil>=2.6.1 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from pandas<1.0,>=0.24.0->kedro) (2.8.1)
Collecting azure-common>=1.1.5
  Using cached azure_common-1.1.25-py2.py3-none-any.whl (12 kB)
Collecting azure-storage-common~=1.4
  Using cached azure_storage_common-1.4.2-py2.py3-none-any.whl (47 kB)
Collecting poyo>=0.1.0
  Using cached poyo-0.5.0-py2.py3-none-any.whl (10 kB)
Collecting jinja2-time>=0.1.0
  Using cached jinja2_time-0.2.0-py2.py3-none-any.whl (6.4 kB)
Collecting whichcraft>=0.4.0
  Using cached whichcraft-0.6.1-py2.py3-none-any.whl (5.2 kB)
Collecting binaryornot>=0.2.0
  Using cached binaryornot-0.4.4-py2.py3-none-any.whl (9.0 kB)
Requirement already satisfied: jinja2>=2.7 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from cookiecutter<2.0,>=1.6.0->kedro) (2.11.1)
Processing /home/ec2-user/.cache/pip/wheels/8b/99/a0/81daf51dcd359a9377b110a8a886b3895921802d2fc1b2397e/future-0.18.2-cp36-none-any.whl
Requirement already satisfied: mock>=2.0 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from tables<3.6,>=3.4.4->kedro) (3.0.5)
Collecting numexpr>=2.6.2
  Downloading numexpr-2.7.1-cp36-cp36m-manylinux1_x86_64.whl (162 kB)
     |████████████████████████████████| 162 kB 66.7 MB/s 
Requirement already satisfied: six>=1.9.0 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from tables<3.6,>=3.4.4->kedro) (1.14.0)
Collecting pydata-google-auth
  Using cached pydata_google_auth-0.3.0-py2.py3-none-any.whl (12 kB)
Collecting google-auth-oauthlib
  Using cached google_auth_oauthlib-0.4.1-py2.py3-none-any.whl (18 kB)
Collecting google-cloud-bigquery>=1.11.1
  Using cached google_cloud_bigquery-1.24.0-py2.py3-none-any.whl (165 kB)
Collecting google-auth
  Using cached google_auth-1.12.0-py2.py3-none-any.whl (83 kB)
Requirement already satisfied: setuptools in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from pandas-gbq<1.0,>=0.12.0->kedro) (46.1.1.post20200323)
Requirement already satisfied: boto3>=1.9.91 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from s3fs<1.0,>=0.3.0->kedro) (1.12.27)
Requirement already satisfied: botocore>=1.12.91 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from s3fs<1.0,>=0.3.0->kedro) (1.15.27)
Requirement already satisfied: idna<3,>=2.5 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from requests<3.0,>=2.20.0->kedro) (2.9)
Requirement already satisfied: chardet<4,>=3.0.2 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from requests<3.0,>=2.20.0->kedro) (3.0.4)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from requests<3.0,>=2.20.0->kedro) (1.22)
Requirement already satisfied: certifi>=2017.4.17 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from requests<3.0,>=2.20.0->kedro) (2019.11.28)
Requirement already satisfied: cryptography in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from azure-storage-common~=1.4->azure-storage-file<2.0,>=1.1.0->kedro) (2.8)
Collecting arrow
  Using cached arrow-0.15.5-py2.py3-none-any.whl (46 kB)
Requirement already satisfied: MarkupSafe>=0.23 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from jinja2>=2.7->cookiecutter<2.0,>=1.6.0->kedro) (1.1.1)
Collecting requests-oauthlib>=0.7.0
  Using cached requests_oauthlib-1.3.0-py2.py3-none-any.whl (23 kB)
Collecting google-resumable-media<0.6dev,>=0.5.0
  Using cached google_resumable_media-0.5.0-py2.py3-none-any.whl (38 kB)
Collecting google-cloud-core<2.0dev,>=1.1.0
  Using cached google_cloud_core-1.3.0-py2.py3-none-any.whl (26 kB)
Requirement already satisfied: protobuf>=3.6.0 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from google-cloud-bigquery>=1.11.1->pandas-gbq<1.0,>=0.12.0->kedro) (3.11.3)
Collecting google-api-core<2.0dev,>=1.15.0
  Using cached google_api_core-1.16.0-py2.py3-none-any.whl (70 kB)
Collecting pyasn1-modules>=0.2.1
  Using cached pyasn1_modules-0.2.8-py2.py3-none-any.whl (155 kB)
Requirement already satisfied: rsa<4.1,>=3.1.4 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from google-auth->pandas-gbq<1.0,>=0.12.0->kedro) (3.4.2)
Collecting cachetools<5.0,>=2.0.0
  Using cached cachetools-4.0.0-py3-none-any.whl (10 kB)
Requirement already satisfied: s3transfer<0.4.0,>=0.3.0 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from boto3>=1.9.91->s3fs<1.0,>=0.3.0->kedro) (0.3.3)
Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from boto3>=1.9.91->s3fs<1.0,>=0.3.0->kedro) (0.9.4)
Requirement already satisfied: docutils<0.16,>=0.10 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from botocore>=1.12.91->s3fs<1.0,>=0.3.0->kedro) (0.15.2)
Requirement already satisfied: cffi!=1.11.3,>=1.8 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from cryptography->azure-storage-common~=1.4->azure-storage-file<2.0,>=1.1.0->kedro) (1.14.0)
Collecting oauthlib>=3.0.0
  Using cached oauthlib-3.1.0-py2.py3-none-any.whl (147 kB)
Processing /home/ec2-user/.cache/pip/wheels/2c/f9/7f/6eb87e636072bf467e25348bbeb96849333e6a080dca78f706/googleapis_common_protos-1.51.0-cp36-none-any.whl
Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from pyasn1-modules>=0.2.1->google-auth->pandas-gbq<1.0,>=0.12.0->kedro) (0.4.8)
Requirement already satisfied: pycparser in /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages (from cffi!=1.11.3,>=1.8->cryptography->azure-storage-common~=1.4->azure-storage-file<2.0,>=1.1.0->kedro) (2.20)
Building wheels for collected packages: SQLAlchemy
  Building wheel for SQLAlchemy (PEP 517) ... done
  Created wheel for SQLAlchemy: filename=SQLAlchemy-1.3.15-cp36-cp36m-linux_x86_64.whl size=1215829 sha256=112167e02a19acada7f367d8aca55bbd1e0c655de9edfabebae5e9d055d9a9a6
  Stored in directory: /home/ec2-user/.cache/pip/wheels/4a/1b/3a/c73044d7be48baeb47cbee343334f7803726ca1e9ba7b29095
Successfully built SQLAlchemy
Installing collected packages: pandas, azure-common, azure-storage-common, azure-storage-file, click, poyo, arrow, jinja2-time, whichcraft, binaryornot, future, cookiecutter, SQLAlchemy, numexpr, tables, python-json-logger, azure-storage-blob, pyasn1-modules, cachetools, google-auth, oauthlib, requests-oauthlib, google-auth-oauthlib, pydata-google-auth, google-resumable-media, googleapis-common-protos, google-api-core, google-cloud-core, google-cloud-bigquery, pandas-gbq, xlsxwriter, pip-tools, pyarrow, xlrd, azure-storage-queue, anyconfig, toposort, kedro
  Attempting uninstall: pandas
    Found existing installation: pandas 0.22.0
    Uninstalling pandas-0.22.0:
      Successfully uninstalled pandas-0.22.0
Successfully installed SQLAlchemy-1.3.15 anyconfig-0.9.10 arrow-0.15.5 azure-common-1.1.25 azure-storage-blob-1.5.0 azure-storage-common-1.4.2 azure-storage-file-1.4.0 azure-storage-queue-1.4.0 binaryornot-0.4.4 cachetools-4.0.0 click-7.1.1 cookiecutter-1.7.0 future-0.18.2 google-api-core-1.16.0 google-auth-1.12.0 google-auth-oauthlib-0.4.1 google-cloud-bigquery-1.24.0 google-cloud-core-1.3.0 google-resumable-media-0.5.0 googleapis-common-protos-1.51.0 jinja2-time-0.2.0 kedro-0.15.8 numexpr-2.7.1 oauthlib-3.1.0 pandas-0.25.3 pandas-gbq-0.13.1 pip-tools-4.5.1 poyo-0.5.0 pyarrow-0.16.0 pyasn1-modules-0.2.8 pydata-google-auth-0.3.0 python-json-logger-0.1.11 requests-oauthlib-1.3.0 tables-3.5.2 toposort-1.5 whichcraft-0.6.1 xlrd-1.2.0 xlsxwriter-1.2.8

Your Environment

Include as many relevant details about the environment in which you experienced the bug:

environment | terminal | notebook
kedro -V | kedro, version 0.15.8 | kedro, version 0.15.8
python -V | Python 3.6.10 :: Anaconda, Inc. | Python 3.6.5 :: Anaconda, Inc.
os | PRETTY_NAME="Amazon Linux AMI 2018.03" ID_LIKE="rhel fedora" | PRETTY_NAME="Amazon Linux AMI 2018.03" ID_LIKE="rhel fedora"

pip freeze (terminal):

anyconfig==0.9.10
arrow==0.15.5
asn1crypto==1.3.0
attrs==19.3.0
autovizwidget==0.12.9
awscli==1.18.27
azure-common==1.1.25
azure-storage-blob==1.5.0
azure-storage-common==1.4.2
azure-storage-file==1.4.0
azure-storage-queue==1.4.0
backcall==0.1.0
bcrypt==3.1.7
binaryornot==0.4.4
bleach==3.1.0
boto3==1.12.27
botocore==1.15.27
cached-property==1.5.1
cachetools==4.0.0
certifi==2019.11.28
cffi==1.14.0
chardet==3.0.4
click==7.1.1
colorama==0.4.3
cookiecutter==1.7.0
cryptography==2.8
decorator==4.4.2
defusedxml==0.6.0
docker==4.2.0
docker-compose==1.25.4
dockerpty==0.4.1
docopt==0.6.2
docutils==0.15.2
entrypoints==0.3
environment-kernels==1.1.1
fsspec==0.6.3
future==0.18.2
gitdb==4.0.2
GitPython==3.1.0
google-api-core==1.16.0
google-auth==1.12.0
google-auth-oauthlib==0.4.1
google-cloud-bigquery==1.24.0
google-cloud-core==1.3.0
google-resumable-media==0.5.0
googleapis-common-protos==1.51.0
hdijupyterutils==0.12.9
idna==2.9
importlib-metadata==1.5.0
ipykernel==5.1.4
ipython==7.13.0
ipython-genutils==0.2.0
ipywidgets==7.5.1
jedi==0.16.0
Jinja2==2.11.1
jinja2-time==0.2.0
jmespath==0.9.4
json5==0.9.3
jsonschema==3.2.0
jupyter==1.0.0
jupyter-client==6.0.0
jupyter-console==6.1.0
jupyter-core==4.6.1
jupyterlab==1.2.7
jupyterlab-git==0.9.0
jupyterlab-server==1.0.7
kedro==0.15.8
MarkupSafe==1.1.1
mistune==0.8.4
mock==3.0.5
nb-conda==2.2.1
nb-conda-kernels==2.2.3
nbconvert==5.6.1
nbdime==2.0.0
nbexamples==0.0.0
nbformat==5.0.4
nbserverproxy==0.3.2
nose==1.3.7
notebook==5.7.8
numexpr==2.7.1
numpy==1.18.1
oauthlib==3.1.0
packaging==20.3
pandas==0.25.3
pandas-gbq==0.13.1
pandocfilters==1.4.2
paramiko==2.7.1
parso==0.6.2
pexpect==4.8.0
pickleshare==0.7.5
pid==3.0.0
pip-tools==4.5.1
plotly==4.5.4
poyo==0.5.0
prometheus-client==0.7.1
prompt-toolkit==3.0.3
protobuf==3.11.3
protobuf3-to-dict==0.1.5
psutil==5.7.0
psycopg2==2.8.4
ptyprocess==0.6.0
py4j==0.10.7
pyarrow==0.16.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser==2.20
pydata-google-auth==0.3.0
pygal==2.4.0
Pygments==2.6.1
pykerberos==1.1.14
PyNaCl==1.3.0
pyOpenSSL==19.1.0
pyparsing==2.4.6
pyrsistent==0.15.7
PySocks==1.7.1
pyspark==2.3.2
python-dateutil==2.8.1
python-json-logger==0.1.11
pytz==2019.3
PyYAML==5.3.1
pyzmq==18.1.1
qtconsole==4.7.1
QtPy==1.9.0
requests==2.23.0
requests-kerberos==0.12.0
requests-oauthlib==1.3.0
retrying==1.3.3
rsa==3.4.2
s3fs==0.4.0
s3transfer==0.3.3
sagemaker==1.51.4
sagemaker-experiments==0.1.10
sagemaker-nbi-agent==1.0
sagemaker-pyspark==1.2.8
scipy==1.4.1
Send2Trash==1.5.0
six==1.14.0
smdebug-rulesconfig==0.1.2
smmap==3.0.1
sparkmagic==0.15.0
SQLAlchemy==1.3.15
tables==3.5.2
terminado==0.8.3
testpath==0.4.4
texttable==1.6.2
toposort==1.5
tornado==6.0.4
traitlets==4.3.3
urllib3==1.22
wcwidth==0.1.8
webencodings==0.5.1
websocket-client==0.57.0
whichcraft==0.6.1
widgetsnbextension==3.5.1
xlrd==1.2.0
XlsxWriter==1.2.8
zipp==2.2.0

pip freeze (notebook):

alabaster==0.7.10
anaconda-client==1.6.14
anaconda-project==0.8.2
anyconfig==0.9.10
arrow==0.15.5
asn1crypto==0.24.0
astroid==1.6.3
astropy==3.0.2
attrs==18.1.0
Automat==0.3.0
autovizwidget==0.15.0
awscli==1.18.27
azure-common==1.1.25
azure-storage-blob==1.5.0
azure-storage-common==1.4.2
azure-storage-file==1.4.0
azure-storage-queue==1.4.0
Babel==2.5.3
backcall==0.1.0
backports.shutil-get-terminal-size==1.0.0
bcrypt==3.1.7
beautifulsoup4==4.6.0
binaryornot==0.4.4
bitarray==0.8.1
bkcharts==0.2
blaze==0.11.3
bleach==2.1.3
bokeh==1.0.4
boto==2.48.0
boto3==1.12.27
botocore==1.15.27
Bottleneck==1.2.1
cached-property==1.5.1
cachetools==4.0.0
certifi==2019.11.28
cffi==1.11.5
characteristic==14.3.0
chardet==3.0.4
click==6.7
cloudpickle==0.5.3
clyent==1.2.2
colorama==0.3.9
contextlib2==0.5.5
cookiecutter==1.7.0
cryptography==2.8
cycler==0.10.0
Cython==0.28.4
cytoolz==0.9.0.1
dask==0.17.5
datashape==0.5.4
decorator==4.3.0
defusedxml==0.6.0
distributed==1.21.8
docker==4.2.0
docker-compose==1.25.4
dockerpty==0.4.1
docopt==0.6.2
docutils==0.14
entrypoints==0.2.3
enum34==1.1.9
environment-kernels==1.1.1
et-xmlfile==1.0.1
fastcache==1.0.2
filelock==3.0.4
Flask==1.0.2
Flask-Cors==3.0.4
fsspec==0.7.1
future==0.18.2
gevent==1.3.0
glob2==0.6
gmpy2==2.0.8
google-api-core==1.16.0
google-auth==1.12.0
google-auth-oauthlib==0.4.1
google-cloud-bigquery==1.24.0
google-cloud-core==1.3.0
google-resumable-media==0.5.0
googleapis-common-protos==1.51.0
greenlet==0.4.13
h5py==2.8.0
hdijupyterutils==0.15.0
heapdict==1.0.0
html5lib==1.0.1
idna==2.6
imageio==2.3.0
imagesize==1.0.0
importlib-metadata==1.5.0
ipykernel==4.8.2
ipyparallel==6.2.2
ipython==6.4.0
ipython-genutils==0.2.0
ipywidgets==7.4.0
isort==4.3.4
itsdangerous==0.24
jdcal==1.4
jedi==0.12.0
Jinja2==2.10
jinja2-time==0.2.0
jmespath==0.9.4
jsonschema==2.6.0
jupyter==1.0.0
jupyter-client==5.2.3
jupyter-console==5.2.0
jupyter-core==4.4.0
jupyterlab==0.32.1
jupyterlab-launcher==0.10.5
kedro==0.15.8
kiwisolver==1.0.1
lazy-object-proxy==1.3.1
llvmlite==0.23.1
locket==0.2.0
lxml==4.2.1
MarkupSafe==1.0
matplotlib==3.0.3
mccabe==0.6.1
mistune==0.8.3
mkl-fft==1.0.0
mkl-random==1.0.1
mock==4.0.1
more-itertools==4.1.0
mpmath==1.0.0
msgpack==0.6.0
msgpack-python==0.5.6
multipledispatch==0.5.0
nb-conda==2.2.1
nb-conda-kernels==2.2.2
nbconvert==5.4.1
nbformat==4.4.0
networkx==2.1
nltk==3.3
nose==1.3.7
notebook==5.5.0
numba==0.38.0
numexpr==2.6.5
numpy==1.14.3
numpydoc==0.8.0
oauthlib==3.1.0
odo==0.5.1
olefile==0.45.1
opencv-python==3.4.2.17
openpyxl==2.5.3
packaging==20.1
pandas==0.24.2
pandas-gbq==0.13.1
pandocfilters==1.4.2
paramiko==2.7.1
parso==0.2.0
partd==0.3.8
path.py==11.0.1
pathlib2==2.3.2
patsy==0.5.0
pep8==1.7.1
pexpect==4.5.0
pickleshare==0.7.4
Pillow==5.1.0
pip-tools==4.5.1
pkginfo==1.4.2
plotly==4.5.2
pluggy==0.6.0
ply==3.11
poyo==0.5.0
prompt-toolkit==1.0.15
protobuf==3.6.1
protobuf3-to-dict==0.1.5
psutil==5.4.5
psycopg2==2.7.5
ptyprocess==0.5.2
py==1.5.3
py4j==0.10.7
pyarrow==0.16.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycodestyle==2.4.0
pycosat==0.6.3
pycparser==2.18
pycrypto==2.6.1
pycurl==7.43.0.1
pydata-google-auth==0.3.0
pyflakes==1.6.0
pygal==2.4.0
Pygments==2.2.0
pykerberos==1.2.1
pylint==1.8.4
PyNaCl==1.3.0
pyodbc==4.0.23
pyOpenSSL==18.0.0
pyparsing==2.2.0
PySocks==1.6.8
pyspark==2.3.2
pytest==3.5.1
pytest-arraydiff==0.2
pytest-astropy==0.3.0
pytest-doctestplus==0.1.3
pytest-openfiles==0.3.0
pytest-remotedata==0.2.1
python-dateutil==2.7.3
python-json-logger==0.1.11
pytz==2018.4
PyWavelets==0.5.2
PyYAML==5.3.1
pyzmq==17.0.0
QtAwesome==0.4.4
qtconsole==4.3.1
QtPy==1.4.1
requests==2.20.0
requests-kerberos==0.12.0
requests-oauthlib==1.3.0
retrying==1.3.3
rope==0.10.7
rsa==3.4.2
ruamel-yaml==0.15.35
s3fs==0.4.2
s3transfer==0.3.3
sagemaker==1.51.4
sagemaker-pyspark==1.2.8
scikit-image==0.13.1
scikit-learn==0.20.3
scipy==1.1.0
seaborn==0.8.1
Send2Trash==1.5.0
simplegeneric==0.8.1
singledispatch==3.4.0.3
six==1.11.0
smdebug-rulesconfig==0.1.2
snowballstemmer==1.2.1
sortedcollections==0.6.1
sortedcontainers==1.5.10
sparkmagic==0.12.5
Sphinx==1.7.4
sphinxcontrib-websupport==1.0.1
spyder==3.2.8
SQLAlchemy==1.2.11
statsmodels==0.9.0
sympy==1.1.1
tables==3.5.2
TBB==0.1
tblib==1.3.2
terminado==0.8.1
testpath==0.3.1
texttable==1.6.2
toolz==0.9.0
toposort==1.5
tornado==5.0.2
traitlets==4.3.2
typing==3.6.4
unicodecsv==0.14.1
urllib3==1.23
wcwidth==0.1.7
webencodings==0.5.1
websocket-client==0.57.0
Werkzeug==0.14.1
whichcraft==0.6.1
widgetsnbextension==3.4.2
wrapt==1.10.11
xlrd==1.1.0
XlsxWriter==1.0.4
xlwt==1.3.0
zict==0.1.3
zipp==3.0.0
WaylonWalker commented 4 years ago

Kedro aside, there are a couple of things you can do to make sure your terminal and notebook environments match. I am not familiar with the new pandas.CSVDataSet, as I am only just starting with 0.15.8 myself. We have struggled to get package installs right from inside notebooks, so I make sure everyone on my team uses their own environment, created from the terminal.

activate python3 from the terminal before install

Note that the file browser on the left-hand side of a SageMaker notebook is actually mounted at ~/SageMaker.

source activate python3
# may also be: conda activate python3
# unrelated, but on Windows it was: activate python3
cd ~/SageMaker/testing/notebooks # this appears to be where your project is
kedro install
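
A related pitfall is that `!pip install` in a notebook cell can resolve to a different pip than the one that belongs to the running kernel. The following is a minimal sketch of one way around that, assuming the package of interest is `kedro` and that invoking pip through `sys.executable` is acceptable in your setup. The helper names `pip_install_command` and `install_into_current_kernel` are hypothetical, not part of Kedro or SageMaker.

```python
import subprocess
import sys


def pip_install_command(package, python=sys.executable):
    """Build a pip command tied to a specific interpreter (hypothetical helper)."""
    # "python -m pip" installs into the environment of that interpreter,
    # not whichever "pip" happens to be first on PATH.
    return [python, "-m", "pip", "install", package]


def install_into_current_kernel(package):
    """Install a package into the environment of the running interpreter."""
    subprocess.check_call(pip_install_command(package))


# Show which interpreter the current kernel (or script) is running on.
print("interpreter:", sys.executable)
print("command:", " ".join(pip_install_command("kedro")))
```

Calling `install_into_current_kernel("kedro")` from a notebook cell would then install into the same environment the kernel imports from, which is the property the terminal-based steps above are trying to guarantee.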

install ipykernel in your terminal env

For conda environments to show up in the notebook's kernel dropdown, you need ipykernel installed. See the docs.

conda create -n testing python=3.6
pip install ipykernel
# I typically don't have to go this far, but installing ipykernel is recommended by the docs
ipykernel install --user 
cd ~/SageMaker/testing/notebooks # this appears to be where your project is
kedro install

Do note that if you shut down your SageMaker notebook, you will lose your packages and environments by default.

I also noticed that your pandas versions differ between the two environments (0.25.3 in the terminal, 0.24.2 in the notebook). I have no idea if that changes things, but it might be a simple fix.
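
Spotting such differences by eye in two long `pip freeze` listings is error-prone. Here is a rough sketch of how the two outputs could be compared, assuming each listing has been saved as text; the function names `parse_freeze` and `diff_freeze` are made up for illustration, and only simple `name==version` lines are handled.

```python
def parse_freeze(text):
    """Map package name (lower-cased) to pinned version from `pip freeze` output."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and non-pinned entries such as editable installs.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        versions[name.strip().lower()] = version.strip()
    return versions


def diff_freeze(left, right):
    """Return {package: (left_version, right_version)} where the two differ.

    A package missing on one side is reported with None for that side.
    """
    a, b = parse_freeze(left), parse_freeze(right)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# A few lines taken from the two freeze listings in this issue.
terminal = "kedro==0.15.8\npandas==0.25.3\nfsspec==0.6.3\ns3fs==0.4.0\n"
notebook = "kedro==0.15.8\npandas==0.24.2\nfsspec==0.7.1\ns3fs==0.4.2\n"

for name, (term, nb) in diff_freeze(terminal, notebook).items():
    print(f"{name}: terminal={term} notebook={nb}")
```

On the sample lines above this reports fsspec, pandas, and s3fs as differing, while kedro itself is the same version on both sides.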

tjcuddihy commented 4 years ago

Your second idea worked, @WaylonWalker. I adapted it slightly, as it didn't work as written:

conda create --yes --name kedroenv python=3.6 ipykernel
source activate kedroenv
python -m ipykernel install --user --name kedroenv --display-name "Kedro py3.6"

cd ~/SageMaker
kedro new  # name the project "testing" and include the example pipeline
cd testing/
kedro run
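
To confirm that the "Kedro py3.6" kernel really launches the interpreter from the new environment, the registered kernel specs can be inspected. The sketch below assumes that `--user` kernels are written under `~/.local/share/jupyter/kernels` (Jupyter's usual per-user location on Linux; the path may differ on your instance) and that each kernel folder contains a `kernel.json` with an `argv` list. The helper name `kernel_interpreters` is hypothetical.

```python
import json
from pathlib import Path


def kernel_interpreters(kernels_dir):
    """Map each kernel name under kernels_dir to (display_name, interpreter).

    Each registered kernel is a folder containing a kernel.json whose
    "argv" list starts with the Python executable the kernel launches.
    """
    root = Path(kernels_dir)
    if not root.is_dir():
        return {}
    result = {}
    for spec_file in sorted(root.glob("*/kernel.json")):
        spec = json.loads(spec_file.read_text())
        argv = spec.get("argv") or [None]
        result[spec_file.parent.name] = (spec.get("display_name"), argv[0])
    return result


# Assumed default location for `--user` kernels on Linux.
user_kernels = Path.home() / ".local" / "share" / "jupyter" / "kernels"
for name, (display, python) in kernel_interpreters(user_kernels).items():
    print(f"{name}: {display!r} -> {python}")
```

If the interpreter listed for `kedroenv` does not live inside the conda environment created above, the notebook will import packages from somewhere else, which matches the symptom in this issue.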

With a reasonable solution, I'll call this issue closed. Massive thank you @WaylonWalker for pointing me in the right direction.

Cheers, Tom

yetudada commented 3 years ago

@tjcuddihy We're working with the AWS team to produce a knowledge document on using Kedro and Sagemaker. Would we be able to talk to you about how you used them together?

uwaisiqbal commented 3 years ago

I'd be keen on learning more about how to make Sagemaker play nicely with kedro so I can still access everything I need from my kedro context. @yetudada I have an alpha version of a kedro plugin that plays nicely with sagemaker and allows you to run processing jobs.

yetudada commented 3 years ago

@uwaisiqbal then you might be interested in this knowledge article that was just published on AWS: https://aws.amazon.com/blogs/opensource/using-kedro-pipelines-to-train-amazon-sagemaker-models/ 🚀