HDI-Project / ATM

Auto Tune Models - A multi-tenant, multi-data system for automated machine learning (model selection and tuning).
https://hdi-project.github.io/ATM/
MIT License
527 stars 141 forks

enter_data.py file error #91

Closed: acudworth3 closed this issue 6 years ago

acudworth3 commented 6 years ago

When running python scripts/enter_data.py I get the following error. I believe I've installed everything correctly. When I trace the file path, the file is not there. I copied it into that location but got the same error. Seems to be an install issue? (I replaced my name with XXXXX.)

Traceback (most recent call last):
  File "scripts/enter_data.py", line 47, in <module>
    **vars(args))
  File "/Users/XXXXXXXXX/anaconda3/lib/python3.6/site-packages/atm-0.0.1-py3.6.egg/atm/config.py", line 541, in load_config
    with open(log_path) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/XXXXXXXX/anaconda3/lib/python3.6/site-packages/atm-0.0.1-py3.6.egg/atm/config/templates/log-script.yaml'

micahjsmith commented 6 years ago

What was the full command you were running, and can you provide more details on your installation?

acudworth3 commented 6 years ago

Hey, thanks for the response.

I'm at the step to test the installation. The exact command is:

python scripts/enter_data.py

acudworth3 commented 6 years ago

Also, I am on a Mac. I don't know if that is relevant, but it affected the install for MySQL. I believe I worked through that (used brew install).

micahjsmith commented 6 years ago

What is your pwd when you run the command?

acudworth3 commented 6 years ago

Hey, the pwd is /Users/XXXXXXX/Desktop/ATM_MIT/atm, and from there I run python scripts/enter_data.py.

I set up a folder specifically to work with this, called ATM_MIT. I appreciate the help.

The full error when the command is run:

Traceback (most recent call last):
  File "scripts/enter_data.py", line 47, in <module>
    **vars(args))
  File "/Users/XXXXXXXXX/anaconda3/lib/python3.6/site-packages/atm-0.0.1-py3.6.egg/atm/config.py", line 541, in load_config
    with open(log_path) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/XXXXXXXXX/anaconda3/lib/python3.6/site-packages/atm-0.0.1-py3.6.egg/atm/config/templates/log-script.yaml'

csala commented 6 years ago

I was able to reproduce the error:

$ git clone git@github.com:HDI-Project/ATM.git
$ cd atm
$ mkvirtualenv -p $(which python3.6) -a $(pwd) $(basename $(pwd))
$ pip install -r requirements.txt
$ pip install -r requirements-dev.txt
$ python scripts/enter_data.py
Traceback (most recent call last):
  File "scripts/enter_data.py", line 47, in <module>
    **vars(args))
  File "/home/xals/.virtualenvs/atm/lib/python3.6/site-packages/atm-0.0.1-py3.6.egg/atm/config.py", line 541, in load_config
    with open(log_path) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/xals/.virtualenvs/atm/lib/python3.6/site-packages/atm-0.0.1-py3.6.egg/atm/config/templates/log-script.yaml'

The config file log-script.yaml was introduced in commit 95f2d321b207642c778f9e5d963dd89ef8296860, so every revision from that commit onward has this issue.

However, the parent commit, d9600fbd50343c246d4522af08f5043341db476f, fails with a similar error:

Traceback (most recent call last):
  File "scripts/enter_data.py", line 44, in <module>
    enter_data(sql_conf, run_conf, aws_conf, args.run_per_partition)
  File "/home/xals/.virtualenvs/atm/lib/python3.6/site-packages/atm-0.0.1-py3.6.egg/atm/enter_data.py", line 114, in enter_data
    dataset = create_dataset(db, run_config, aws_config=aws_config)
  File "/home/xals/.virtualenvs/atm/lib/python3.6/site-packages/atm-0.0.1-py3.6.egg/atm/enter_data.py", line 39, in create_dataset
    meta = MetaData(run_config.class_column, train_local, test_local)
  File "/home/xals/.virtualenvs/atm/lib/python3.6/site-packages/atm-0.0.1-py3.6.egg/atm/encoder.py", line 19, in __init__
    data = pd.read_csv(train_path)
  File "/home/xals/.virtualenvs/atm/lib/python3.6/site-packages/pandas/io/parsers.py", line 709, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/xals/.virtualenvs/atm/lib/python3.6/site-packages/pandas/io/parsers.py", line 449, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/home/xals/.virtualenvs/atm/lib/python3.6/site-packages/pandas/io/parsers.py", line 818, in __init__
    self._make_engine(self.engine)
  File "/home/xals/.virtualenvs/atm/lib/python3.6/site-packages/pandas/io/parsers.py", line 1049, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/home/xals/.virtualenvs/atm/lib/python3.6/site-packages/pandas/io/parsers.py", line 1695, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 402, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 718, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: File b'/home/xals/.virtualenvs/atm/lib/python3.6/site-packages/atm-0.0.1-py3.6.egg/atm/data/test/pollution_1.csv' does not exist

So, the problem seems to be that setup.py does not include the folders inside atm that are not Python packages: atm/config and atm/data. As a result, the files in them are not copied into the installed egg.
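One possible shape for such a fix, as a minimal sketch and not the change that was actually made to ATM: collect every file under the non-package folders and hand the list to the package_data argument of setuptools. The helper name collect_data_files and the setup_kwargs dictionary are hypothetical; only atm/config and atm/data come from the tracebacks above.

```python
import os


def collect_data_files(package_dir, subdirs):
    """Return the paths, relative to package_dir, of every file under subdirs.

    Hypothetical helper: it walks each sub-directory recursively so that
    nested files such as config/templates/log-script.yaml are picked up.
    """
    collected = []
    for subdir in subdirs:
        root = os.path.join(package_dir, subdir)
        # os.walk yields nothing if the directory does not exist.
        for dirpath, _dirnames, filenames in os.walk(root):
            for filename in filenames:
                full_path = os.path.join(dirpath, filename)
                collected.append(os.path.relpath(full_path, package_dir))
    return sorted(collected)


# Assumed keyword arguments for setuptools.setup(); package_data maps a
# package name to file patterns relative to that package's directory.
setup_kwargs = {
    "package_data": {"atm": collect_data_files("atm", ["config", "data"])},
}
```

In a real setup.py these keyword arguments would be passed to setuptools.setup() along with the existing ones. Listing the files explicitly is only one option; include_package_data together with a MANIFEST.in is another common approach.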

@acudworth3 This won't work properly until setup.py is fixed to include those directories. However, as an interim solution, you can install the package locally in editable mode with pip, instead of using the setup.py install indicated in the README.md.

To do so, run the following commands inside the repository directory:

pip uninstall -y atm
pip install -e .
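To double-check the result of the reinstall, a small script along these lines can report whether the two files named in the tracebacks above are visible from the directory Python imports atm from. This is only a sketch: the helper name missing_data_files and the EXPECTED_FILES list are hypothetical and not part of ATM.

```python
import os

# The two files that the tracebacks in this thread report as missing,
# given relative to the atm package directory.
EXPECTED_FILES = [
    os.path.join("config", "templates", "log-script.yaml"),
    os.path.join("data", "test", "pollution_1.csv"),
]


def missing_data_files(package_dir, expected=EXPECTED_FILES):
    """Return the entries of expected that do not exist under package_dir."""
    return [
        relative_path
        for relative_path in expected
        if not os.path.isfile(os.path.join(package_dir, relative_path))
    ]
```

To use it, import atm and pass os.path.dirname(atm.__file__) as package_dir. An empty list means both files were found.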

After that, it works properly:

$ git checkout master 
Already on 'master'
Your branch is up-to-date with 'origin/master'.
$ pip uninstall -y atm
Uninstalling atm-0.0.1:
  Successfully uninstalled atm-0.0.1
$ pip install -e .
Obtaining file:///home/xals/Projects/Pythia/MIT/atm
Requirement already satisfied: sqlalchemy>=1.1 in /home/xals/.virtualenvs/atm/lib/python3.6/site-packages (from atm==0.0.1) (1.1.14)
Requirement already satisfied: numpy>=1.13 in /home/xals/.virtualenvs/atm/lib/python3.6/site-packages (from atm==0.0.1) (1.13.1)
Requirement already satisfied: boto>=2.48 in /home/xals/.virtualenvs/atm/lib/python3.6/site-packages (from atm==0.0.1) (2.48.0)
Requirement already satisfied: pandas>=0.22 in /home/xals/.virtualenvs/atm/lib/python3.6/site-packages (from atm==0.0.1) (0.22.0)
Requirement already satisfied: scikit-learn>=0.18 in /home/xals/.virtualenvs/atm/lib/python3.6/site-packages (from atm==0.0.1) (0.18.2)
Requirement already satisfied: scipy>=0.19 in /home/xals/.virtualenvs/atm/lib/python3.6/site-packages (from atm==0.0.1) (0.19.1)
Requirement already satisfied: sklearn-pandas>=1.5 in /home/xals/.virtualenvs/atm/lib/python3.6/site-packages (from atm==0.0.1) (1.5.0)
Requirement already satisfied: mysqlclient>=1.2 in /home/xals/.virtualenvs/atm/lib/python3.6/site-packages (from atm==0.0.1) (1.3.12)
Requirement already satisfied: pyyaml>=3.12 in /home/xals/.virtualenvs/atm/lib/python3.6/site-packages (from atm==0.0.1) (3.12)
Requirement already satisfied: joblib>=0.11 in /home/xals/.virtualenvs/atm/lib/python3.6/site-packages (from atm==0.0.1) (0.11)
Requirement already satisfied: future>=0.16 in /home/xals/.virtualenvs/atm/lib/python3.6/site-packages (from atm==0.0.1) (0.16.0)
Requirement already satisfied: btb>=0.0.1 in /home/xals/.virtualenvs/atm/src/btb (from atm==0.0.1) (1.0.0)
Requirement already satisfied: python-dateutil>=2 in /home/xals/.virtualenvs/atm/lib/python3.6/site-packages (from pandas>=0.22->atm==0.0.1) (2.7.2)
Requirement already satisfied: pytz>=2011k in /home/xals/.virtualenvs/atm/lib/python3.6/site-packages (from pandas>=0.22->atm==0.0.1) (2018.3)
Requirement already satisfied: six>=1.5 in /home/xals/.virtualenvs/atm/lib/python3.6/site-packages (from python-dateutil>=2->pandas>=0.22->atm==0.0.1) (1.11.0)
Installing collected packages: atm
  Running setup.py develop for atm
Successfully installed atm
$ python scripts/enter_data.py 
method logreg has 6 hyperpartitions
method dt has 2 hyperpartitions
method knn has 24 hyperpartitions
Data entry complete. Summary:
    Dataset ID: 2
    Training data: /home/xals/Projects/Pythia/MIT/atm/atm/data/test/pollution_1.csv
    Test data: None
    Datarun ID: 2
    Hyperpartition selection strategy: uniform
    Parameter tuning strategy: uniform
    Budget: 100 (classifier)

acudworth3 commented 6 years ago

@csala That worked! Thanks so much. Cool project you guys are working on here.