Building-ML-Pipelines / building-machine-learning-pipelines

Code repository for the O'Reilly publication "Building Machine Learning Pipelines" by Hannes Hapke & Catherine Nelson
MIT License

FileNotFoundError after executing convert_data_to_tfrecords.py #6

Closed snehankekre closed 4 years ago

snehankekre commented 4 years ago

Bug

An incorrect file name is assigned to original_data_file on line 30 of convert_data_to_tfrecords.py, which leads to a FileNotFoundError when the script runs.

System details

Steps to reproduce

git clone https://github.com/Building-ML-Pipelines/building-machine-learning-pipelines.git
cd building-machine-learning-pipelines/chapters/data_ingestion/
python3 convert_data_to_tfrecords.py

2020-07-15 09:21:35.509359: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
Traceback (most recent call last):
  File "/content/building-machine-learning-pipelines/chapters/data_ingestion/convert_data_to_tfrecords.py", line 34, in <module>
    with open(original_data_file) as csv_file:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/consumer-complaints.csv'

Cause

Line 30 of convert_data_to_tfrecords.py sets original_data_file to ../../data/consumer-complaints.csv, but the data set is stored as consumer_complaints_with_narrative.csv upon download, so the open() call on line 34 fails.
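One way to confirm which name the downloaded data set actually has is to list the CSV files in the data directory. The snippet below is a minimal sketch, not part of the repository: list_csv_files is a hypothetical helper, and the ../../data path is taken from the traceback above and assumes the script is run from chapters/data_ingestion/.

```python
from pathlib import Path


def list_csv_files(data_dir):
    """Return the sorted names of all .csv files directly inside data_dir.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    return sorted(p.name for p in Path(data_dir).glob("*.csv"))


if __name__ == "__main__":
    # Assumption: run from chapters/data_ingestion/, so the data directory
    # is two levels up, as in the path shown in the traceback.
    print(list_csv_files("../../data"))
```

If the printed list contains a name other than the one assigned on line 30, that mismatch is the cause of the error.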

Fix

Replace line 30 with the following:

original_data_file = "../../data/consumer_complaints_with_narrative.csv"
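The relative path in the fix only resolves when the script is started from chapters/data_ingestion/, as in the steps to reproduce. As an optional hardening, the path could be built from the script's own location instead of the working directory. This is a sketch under assumptions, not the change made in the repository: data_file_path is a hypothetical helper, and it assumes the data directory sits two levels above the script, as the relative path implies.

```python
from pathlib import Path


def data_file_path(script_file, file_name="consumer_complaints_with_narrative.csv"):
    """Build <script dir>/../../data/<file_name> as an absolute path.

    Hypothetical helper for illustration; the directory layout is assumed
    from the relative path used in the issue.
    """
    script_dir = Path(script_file).resolve().parent
    # Two levels up from chapters/data_ingestion/ is the repository root.
    return script_dir.parent.parent / "data" / file_name


# Possible use inside convert_data_to_tfrecords.py (left commented out here):
# original_data_file = data_file_path(__file__)
```

Because open() accepts Path objects, the rest of the script would not need to change.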

Expected output

python3 convert_data_to_tfrecords.py

2020-07-15 09:31:39.794495: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
66799it [00:10, 6585.19it/s]

hanneshapke commented 4 years ago

@snehankekre Thank you for the PR and for reporting the issue!