Open niclaswue opened 4 months ago
I am not really sure; I had issues in the past when setting the engine explicitly. It might be smarter in general (across the whole code base) to add the numexpr dependency and adjust the queries where things fail, which may be easier to maintain...
For this specific line you are pointing at, I don't mind adding it. Happy to accept a pull request if you want to give it a shot. (Please run ruff on the file before committing, otherwise CI breaks...)
Yes, that would be a good idea as well. My understanding is that eval is only really useful with the numexpr engine anyway. The pandas documentation states:
'python' : Performs operations as if you had eval’d in top level python. This engine is generally not that useful.
Therefore I would ultimately prefer plain Python code over engine="python".
Steps could be:
Anyway, I opened a PR for the line mentioned above.
Yes, thank you. I will leave the issue open as a reminder and deal with it when I can.
Hey Xavier,
I just discovered the library and really like it so far, great job! When tinkering a bit, I tried to work with the SCAT dataset and copied this test case into my notebook:
https://github.com/xoolive/traffic/blob/e8cabd6ac167dc2d2508a3dc2d3b91a60884901c/tests/test_datasets.py#L5
To my surprise, I got this error:
I tried again in a clean conda environment and the error disappeared. It was probably caused by another package that installed the "numexpr" engine. When calling `.eval()`, numexpr is the default evaluation engine (see: pandas docs), and pandas falls back to "python" when numexpr is not found. Some expressions do not seem to be supported by numexpr, so to avoid similar issues I would suggest explicitly setting the engine to "python" in all cases where the numexpr engine will fail. (See also this discussion.) This is what fixed the error for me: Example from SCAT:
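For illustration, here is a minimal sketch of what setting the engine explicitly looks like. It uses a made-up DataFrame rather than the SCAT data, and the column names are hypothetical; it only shows where the `engine` argument goes, not the actual expression that failed.

```python
import importlib.util

import pandas as pd

# Check whether numexpr is importable in this environment.
# If it is, pandas may use it by default for eval/query.
numexpr_installed = importlib.util.find_spec("numexpr") is not None
print("numexpr installed:", numexpr_installed)

# Hypothetical data, not the SCAT dataset.
df = pd.DataFrame(
    {
        "callsign": ["AFR123", "DLH456", "AFR789"],
        "altitude": [35000, 12000, 28000],
    }
)

# engine="python" selects the pure-Python evaluator,
# whether or not numexpr is installed.
high = df.query("altitude > 20000", engine="python")
flight_level = df.eval("altitude / 100", engine="python")
```

With the engine pinned like this, the result should not depend on which optional packages happen to be present in the environment.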
Alternatively, one could use plain Python code for these cases directly. This might be more work to implement, but it is ultimately the best option, as it also improves readability and makes errors easier to debug. What do you think?
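As a rough sketch of that alternative (again with a made-up DataFrame and hypothetical column names), a string expression passed to `.query()` can usually be written as a boolean mask, which involves no evaluation engine at all:

```python
import pandas as pd

# Hypothetical data, not the SCAT dataset.
df = pd.DataFrame(
    {
        "callsign": ["AFR123", "DLH456", "AFR789"],
        "altitude": [35000, 12000, 28000],
    }
)

# Equivalent of df.query("altitude > 20000"), written as a boolean mask.
high = df[df["altitude"] > 20000]

# Conditions combine with & and |; each comparison needs its own parentheses.
afr_high = df[df["callsign"].str.startswith("AFR") & (df["altitude"] > 30000)]
```

A failure in a mask like this points at the exact Python line, rather than at a position inside an expression string.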