MichaelCurrin / daylio-csv-parser

Improve the usability of the Daylio CSV export and explore reports around your data ☺️ 📆 🐍
https://michaelcurrin.github.io/daylio-csv-parser/
MIT License

I'm getting this error: original_activities_str = row["activities"] KeyError: 'activities' #25

Open Tetizera-zz opened 2 years ago

Tetizera-zz commented 2 years ago

Not sure what this is about. I previously got a different error with a .csv that I edited in Excel. This time the .csv was also edited in Excel, but without transforming the data through Power Query; I only used Ctrl + F to replace some of the strings.


(venv) Tet@DESKTOP: make csv

python -m dayliopy.clean_csv
Reading CSV: C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy/var/data_in/daylio_export.csv
Traceback (most recent call last):
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\clean_csv.py", line 227, in <module>
    main()
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\clean_csv.py", line 223, in main
    process(csv_in, csv_out)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\clean_csv.py", line 207, in process
    available_activities, in_data = read_csv(csv_in)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\clean_csv.py", line 151, in read_csv
    activities_list = process_activities(original_activities_str)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\clean_csv.py", line 124, in process_activities
    activities_split = activities_str.split(" | ")
AttributeError: 'NoneType' object has no attribute 'split'
make: *** [Makefile:52: csv] Error 1

Here's my file for debugging:

Additional information:

Windows 10
Python 3.9.10
pip 21.2.4
GNU make 4.3

MichaelCurrin commented 2 years ago

Thanks, I'll look into this.

MichaelCurrin commented 2 years ago

The CSV looks badly formatted and I can't tell if editing with Excel caused that.

Can you send your original CSV so I can see if the issue is there?

When using bat, I found you have a unicode character at the start, and quoting is applied to the whole row (note yellow quotes at start and end of row).
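A rough way to check for both symptoms without a viewer like bat, using only the standard library. This is a sketch, not code from the repo: the `diagnose` helper and the `RAW` sample are hypothetical, and the sample assumes the stray character is a UTF-8 byte order mark, which the thread does not confirm.

```python
import csv
import io

# Hypothetical sample mimicking the symptoms described above: a BOM at the
# start of the file and a data row wrapped in one pair of quotes.
RAW = (
    "\ufefffull_date,date,weekday,time,mood,activities,note_title,note\n"
    '"2022-02-20,February 20,Sunday,20:00,good,"""","""","""""\n'
)


def diagnose(text):
    """Return (has_bom, squashed_line_numbers) for CSV text."""
    has_bom = text.startswith("\ufeff")
    reader = csv.DictReader(io.StringIO(text.lstrip("\ufeff")))
    squashed = []
    for line_no, row in enumerate(reader, start=2):
        values = list(row.values())
        # A whole-row-quoted line parses as a single field, so every
        # column after the first comes back as None.
        if values[0] is not None and all(v is None for v in values[1:]):
            squashed.append(line_no)
    return has_bom, squashed


has_bom, squashed = diagnose(RAW)
print(f"BOM present: {has_bom}; squashed rows: {squashed}")
```

When reading a real file, opening it with `encoding="utf-8-sig"` strips a leading BOM automatically, so the first header is read as `full_date` rather than `\ufefffull_date`.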

[Screenshot, 2022-02-22]

I pushed some code to help you with debugging. Here is my result for make csv. Note how most of the values are None and the row is squashed into the first column full_date.

ValueError: The activities column is present but blank - fix the formatting of your CSV - row: 
  {'full_date': '2022-02-20,February 20,Sunday,20:00,good,"","",""', 'date': None, 
  'weekday': None, 'time': None, 'mood': None, 'activities': None, 
   'note_title': None, 'note': None}
make: *** [csv] Error 1

MichaelCurrin commented 2 years ago

I also added a check so that if the activities column is missing, it logs the columns it can see.
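The two checks described in this thread could look roughly like the sketch below. It is an illustration only: `get_activities_str` is a hypothetical name, and the actual code pushed to the repo may be structured differently. The ValueError message is copied from the output shown earlier.

```python
import logging

logger = logging.getLogger(__name__)


def get_activities_str(row):
    """Return the raw activities string for one parsed CSV row (a dict)."""
    if "activities" not in row:
        # Column missing entirely: log what the parser did find.
        logger.error("Columns found: %s", list(row.keys()))
        raise KeyError("activities")

    value = row["activities"]
    if value is None:
        # Column exists but the row was not split into fields properly.
        raise ValueError(
            "The activities column is present but blank - "
            f"fix the formatting of your CSV - row: {row}"
        )
    return value
```

Failing early here gives a message that points at the CSV formatting, instead of the later `'NoneType' object has no attribute 'split'` error.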

Tetizera-zz commented 2 years ago

I'll check if my csv will work this time, but here's the original file with no modifications daylio_export_2022_02_22.csv

EDIT: I can report that this .csv + your code fixes worked. I created config.local.conf with the proper values too.

I had one problem with make mood. The error was UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 83: invalid continuation byte, which led me to this StackOverflow page where the solution was to change the encoding.

What it was: df = pd.read_csv(csv_in_path, usecols=["mood_label", "mood_score"])

What I added: df = pd.read_csv(csv_in_path, usecols=["mood_label", "mood_score"], encoding = 'latin-1')
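For context, byte 0xed is "í" in Latin-1 but is not valid on its own in UTF-8, which would explain the error if the file was saved in a Latin-1-style encoding (an assumption; the thread does not confirm how the file was saved). A minimal sketch of a fallback, with a hypothetical `decode_csv_bytes` helper and made-up sample data:

```python
# Hypothetical sample: "difícil" stored as Latin-1, so í is the single byte 0xed.
RAW = b"mood_label,mood_score\ndif\xedcil,2\n"


def decode_csv_bytes(data):
    """Decode as UTF-8, falling back to Latin-1 if that fails."""
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this never raises,
        # but it yields wrong characters if the real encoding differs.
        return data.decode("latin-1"), "latin-1"


text, used = decode_csv_bytes(RAW)
print(used)
```

Trying UTF-8 first keeps correctly encoded exports working, whereas hard-coding `encoding='latin-1'` would silently garble any file that really is UTF-8 with non-ASCII characters.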

I happen to have an issue with make fit, which I'll describe in the next comment

Tetizera-zz commented 2 years ago

So, I had the same encoding issue with make fit, so I changed the .py file. I still get errors though.


python -m dayliopy.fit_model
Traceback (most recent call last):
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\fit_model.py", line 107, in <module>
    main()
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\fit_model.py", line 100, in main
    model = fit(csv_in_path)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\fit_model.py", line 84, in fit
    encoded_df = prepare_data(df)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\fit_model.py", line 53, in prepare_data
    df.set_index("datetime", inplace=True, verify_integrity=True)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\venv\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\venv\lib\site-packages\pandas\core\frame.py", line 5510, in set_index
    raise ValueError(f"Index has duplicate keys: {duplicates}")
ValueError: Index has duplicate keys: DatetimeIndex(['2021-09-26 20:00:00'], dtype='datetime64[ns]', name='datetime', freq=None)
make: *** [Makefile:58: fit] Error 1

MichaelCurrin commented 2 years ago

Thanks, I'll look at adding the Latin-1 encoding myself. Thanks for your file.

Regarding your last comment - the issue is that you appear to have two records with the exact same date and time. Would you be willing to change one of your records for 26 Sep to a different time and then run your export again, to avoid that?
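To find which entries clash before re-exporting, something like the following could help. It is a sketch using the standard library; the `find_duplicate_timestamps` helper and the sample entries are hypothetical.

```python
from collections import Counter

# Hypothetical (date, time) pairs as they might appear in an export.
entries = [
    ("2021-09-25", "20:00"),
    ("2021-09-26", "20:00"),
    ("2021-09-26", "20:00"),
    ("2021-09-27", "21:30"),
]


def find_duplicate_timestamps(pairs):
    """Return the (date, time) pairs that occur more than once, sorted."""
    counts = Counter(pairs)
    return sorted(pair for pair, n in counts.items() if n > 1)


duplicates = find_duplicate_timestamps(entries)
print(duplicates)
```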

MichaelCurrin commented 2 years ago

@Tetizera will you review #25 to fix encoding issue, please?

MichaelCurrin commented 2 years ago

I found a way to stop the duplicate error

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.set_index.html

Will you review PR #27 for me please?
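The contents of that PR are not shown in this thread, so the following is only a sketch of two ways the duplicate error could be avoided with pandas. The sample data is made up; `drop_duplicates` and the `verify_integrity` parameter of `set_index` are standard pandas.

```python
import pandas as pd

# Hypothetical data with two records sharing the same timestamp.
df = pd.DataFrame(
    {
        "datetime": pd.to_datetime(
            ["2021-09-26 20:00:00", "2021-09-26 20:00:00", "2021-09-27 21:30:00"]
        ),
        "mood_score": [4, 2, 5],
    }
)

# Option 1: keep only the first record per timestamp, then index strictly.
deduped = df.drop_duplicates(subset="datetime", keep="first")
strict = deduped.set_index("datetime", verify_integrity=True)

# Option 2: allow repeated index values by turning the check off.
lenient = df.set_index("datetime", verify_integrity=False)

print(len(strict), len(lenient))
```

Option 1 discards data, while option 2 keeps every record but leaves a non-unique index, which some later operations may not accept.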

MichaelCurrin commented 2 years ago

Also for the unicode error for make mood

The error on StackOverflow was for Python 2.

Are you on Python 3? If not, can you make sure you run on Python 3 on master and see if the error goes away?