Open Tetizera-zz opened 2 years ago
Thanks, I'll look into this.
The CSV looks badly formatted and I can't tell if editing with Excel caused that.
Can you send your original CSV so I can see if the issue is there?
When using `bat`, I found you have a Unicode character at the start, and quoting is applied to the whole row (note the yellow quotes at the start and end of the row).
I pushed some code to help you with debugging. Here is my result for `make csv`. Note how most of the values are `None` and the row is squashed into the first column, `full_date`.
ValueError: The activities column is present but blank - fix the formatting of your CSV - row:
{'full_date': '2022-02-20,February 20,Sunday,20:00,good,"","",""', 'date': None,
'weekday': None, 'time': None, 'mood': None, 'activities': None,
'note_title': None, 'note': None}
make: *** [csv] Error 1
I also added a bit so that if the `activities` column is missing, it logs the columns it can see.
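For anyone hitting the same thing, here is a rough sketch of how the two symptoms above (a leading Unicode character and a whole row wrapped in quotes) could be checked before running `make csv`. It is not the code I pushed to the repo; `diagnose_csv_text` and its messages are hypothetical names for illustration, and it only uses the standard library `csv` module. The column names are the ones shown in the error output above.

```python
import csv
import io

BOM = "\ufeff"  # byte order mark, often added by Excel and other Windows tools


def diagnose_csv_text(text, expected_columns):
    """Return a list of human-readable problems found in CSV text.

    Hypothetical helper for illustration; not part of dayliopy.
    """
    problems = []

    # A BOM at the start ends up glued to the first header name.
    if text.startswith(BOM):
        problems.append("file starts with a byte order mark (BOM)")
        text = text[len(BOM):]

    reader = csv.DictReader(io.StringIO(text))
    found = reader.fieldnames or []
    missing = [col for col in expected_columns if col not in found]
    if missing:
        problems.append(f"missing columns: {missing}; found: {found}")

    for row in reader:
        values = list(row.values())
        # A row quoted as a whole parses as a single field, so DictReader
        # fills every remaining column with None.
        if len(values) > 1 and all(v is None for v in values[1:]):
            problems.append(
                f"line {reader.line_num}: whole row squashed into first column"
            )

    return problems
```

If the list comes back empty, the file at least has the expected header and no fully quoted rows; it does not prove the values themselves are valid.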
I'll check if my CSV will work this time, but here's the original file with no modifications: daylio_export_2022_02_22.csv
EDIT: I can report that this .csv + your code fixes worked. I created `config.local.conf` with the proper values too.
I had one problem with `make mood`. The error was `UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 83: invalid continuation byte`, which led me to this StackOverflow page where the solution was to change the encoding.
What it was: `df = pd.read_csv(csv_in_path, usecols=["mood_label", "mood_score"])`
What I added: `df = pd.read_csv(csv_in_path, usecols=["mood_label", "mood_score"], encoding='latin-1')`
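Hardcoding `latin-1` works for this file but would mis-read a CSV that really is UTF-8. Byte `0xed` is `í` in both Latin-1 and Windows-1252, so my guess is the file was saved in a legacy Windows code page. A minimal sketch of a fallback, assuming it is acceptable to try a short list of encodings in order (`pick_encoding` is a made-up name, not something in the repo):

```python
def pick_encoding(raw, candidates=("utf-8-sig", "cp1252", "latin-1")):
    """Return the first candidate encoding that decodes the bytes cleanly.

    Hypothetical helper. "utf-8-sig" also strips a leading BOM if present.
    "latin-1" accepts any byte sequence, so it acts as a last resort and
    may produce wrong characters rather than an error.
    """
    for encoding in candidates:
        try:
            raw.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    raise ValueError(f"none of {candidates} could decode the data")


# Possible usage with pandas (csv_in_path as in the snippet above):
#
#     with open(csv_in_path, "rb") as f:
#         encoding = pick_encoding(f.read())
#     df = pd.read_csv(csv_in_path, usecols=["mood_label", "mood_score"],
#                      encoding=encoding)
```

The order matters: UTF-8 is strict, so it is tried first, and the permissive encodings only get a turn when it fails.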
I happen to have an issue with `make fit`, which I'll describe in the next comment.
So, I had the same encoding issue with `make fit`, so I changed the .py file. I still get errors though.
```
python -m dayliopy.fit_model
Traceback (most recent call last):
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\fit_model.py", line 107, in <module>
    main()
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\fit_model.py", line 100, in main
    model = fit(csv_in_path)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\fit_model.py", line 84, in fit
    encoded_df = prepare_data(df)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\fit_model.py", line 53, in prepare_data
    df.set_index("datetime", inplace=True, verify_integrity=True)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\venv\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\venv\lib\site-packages\pandas\core\frame.py", line 5510, in set_index
    raise ValueError(f"Index has duplicate keys: {duplicates}")
ValueError: Index has duplicate keys: DatetimeIndex(['2021-09-26 20:00:00'], dtype='datetime64[ns]', name='datetime', freq=None)
make: *** [Makefile:58: fit] Error 1
```
Thanks, I'll look at adding the latin encoding myself. Thanks for your file.
Regarding your last comment - the issue is you appear to have two records with the exact same date and time. Would you be willing to change one of your records for 26 Sep with a different time and then run your export again, to avoid that?
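If it helps to find which records clash before re-exporting, something along these lines could list the repeated date and time pairs straight from the export. This is only a sketch: `find_duplicate_timestamps` is a hypothetical name, and I am assuming the `full_date` and `time` columns seen in the earlier error output are what ends up combined into the `datetime` index.

```python
import csv
import io
from collections import Counter


def find_duplicate_timestamps(text, date_col="full_date", time_col="time"):
    """Return sorted (date, time) pairs that occur more than once.

    Hypothetical helper; column names are assumed from the export header.
    """
    reader = csv.DictReader(io.StringIO(text))
    counts = Counter((row[date_col], row[time_col]) for row in reader)
    return sorted(key for key, count in counts.items() if count > 1)
```

Any pair it returns is a record whose time would need changing in the app for `set_index(..., verify_integrity=True)` to pass.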
@Tetizera will you review #25 to fix the encoding issue, please?
I found a way to stop the duplicate error
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.set_index.html
Will you review PR #27 for me please?
Also, regarding the Unicode error for `make mood`: the error on StackOverflow was for Python 2.
Are you on Python 3? If not, can you make sure you run on Python 3 on master and see if the error goes away?
Not sure what this is about. I had another error shown to me in a .csv that I edited with Excel. This one, however, is a .csv that I edited with Excel without transforming the data with a PowerQuery, just used Ctrl + F to replace some of the strings.
(venv) Tet@DESKTOP: make csv
Here's my file for debugging:
Additional information:
Windows 10
Python 3.9.10
pip 21.2.4
GNU make 4.3