Closed ritchie46 closed 3 years ago
FYI, the old code will still work for the time being, as we will proxy to the new package name.
I will hopefully merge this week.
@ritchie46 all groupby scripts fail at question2 with
Traceback (most recent call last):
  File "./polars/groupby-polars.py", line 60, in <module>
    print(ans.head(3), flush=True)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 14-33: ordinal not in range(128)
That's strange. I tested locally on G1_1e7_1e2_5_0 and G1_1e7_1e2_5_0 with polars==0.7.9 and cannot reproduce this error.
Thinking aloud here: it seems that the terminal does not support unicode characters. The library I use for printing tables uses unicode characters. Maybe I should consider an ASCII-compliant table format :thinking:
Does setting the locale at the start of the script make any difference?
import os
import locale
os.environ["PYTHONIOENCODING"] = "utf-8"
myLocale=locale.setlocale(category=locale.LC_ALL, locale="en_GB.UTF-8")
You are right. I forgot that I previously had to "patch" the Python environment for the py-polars package. Now I have reinstalled it as polars into a new env where I have not applied the patch yet. My patch was dirtier than what you are proposing: https://github.com/ritchie46/db-benchmark/blob/4a7618f962621984eb657406a603d3787ea0dc12/polars/setup-polars.sh#L25-L34 I will try your suggestion, and if it works there is no need to patch the environment anymore. Thanks
Your suggestion didn't work so I hacked polars/py-polars/bin/activate again and it works now.
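One likely reason the in-script setting had no effect: Python reads PYTHONIOENCODING at interpreter startup, so assigning it through os.environ inside an already running script does not change the encoding of the existing sys.stdout. A minimal sketch of an in-script alternative, assuming Python 3.7+ where text streams have a reconfigure method; the in-memory stream below only stands in for an ASCII-only stdout, and the demo function name is made up for illustration:

```python
import io


def demo():
    """Show an ASCII-bound stream failing on table borders, then succeeding after reconfigure."""
    raw = io.BytesIO()
    # Stand-in for a stdout that was opened with the 'ascii' codec.
    out = io.TextIOWrapper(raw, encoding="ascii")
    border = "\u250c\u2500\u2500\u2510"  # box-drawing characters, like those in a printed table

    try:
        out.write(border)
        ascii_failed = False
    except UnicodeEncodeError:
        ascii_failed = True

    # Switch the same stream to UTF-8 in place (available since Python 3.7).
    out.reconfigure(encoding="utf-8")
    out.write(border)
    out.flush()
    return ascii_failed, raw.getvalue().decode("utf-8")


if __name__ == "__main__":
    print(demo()[0])
```

In a real script the equivalent call would be sys.stdout.reconfigure(encoding="utf-8") before the first print; whether that is enough in the benchmark environment is untested here.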
Ok, but it does work? :)
yes it works and new polars is already running
Great, thanks for your effort!
The 1e9 groupby is being terminated by the timeout, so it takes more than 3h. The benchmark run should finish tomorrow, so we will have the bigger picture.
If the questions have not run at all, I think I know what it might be. I was trying to parse the larger datasets from the benchmark and I noticed that the csv-parser on some edge cases scales quadratically.
For now the report has not been refreshed. Some checksums of the answers produced by polars have changed, and that causes the report workflow to raise an exception.
Error in model_time(clean_time(load_time(path = path))) :
Value of 'chk' varies for different runs for single solution+question
Calls: <Anonymous> ... withVisible -> eval -> eval -> time_logs -> model_time
I have to review those checksums and invalidate the previous ones if needed (or eventually report the changed behavior to you). It can take a little while. Further discussed in https://github.com/ritchie46/polars/issues/357
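To illustrate what the error above is complaining about, here is a rough Python sketch of that kind of consistency check. It is not the benchmark's actual R code: the row layout, the field names other than chk, and the sample values are all made up for illustration.

```python
from collections import defaultdict


def find_varying_checksums(rows):
    """Return {(solution, question): sorted distinct chk values} for groups with more than one value."""
    seen = defaultdict(set)
    for row in rows:
        seen[(row["solution"], row["question"])].add(row["chk"])
    return {key: sorted(vals) for key, vals in seen.items() if len(vals) > 1}


# Hypothetical timing-log rows; the values are invented.
rows = [
    {"solution": "polars", "question": "q1", "run": 1, "chk": "100"},
    {"solution": "polars", "question": "q1", "run": 2, "chk": "100"},
    {"solution": "polars", "question": "q2", "run": 1, "chk": "200"},
    {"solution": "polars", "question": "q2", "run": 2, "chk": "201"},
]

if __name__ == "__main__":
    for key, values in find_varying_checksums(rows).items():
        print(key, values)
```

A group is flagged as soon as two runs of the same solution and question disagree, which matches the situation described here: old and new polars answers recorded under the same question.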
If you want to access the timings now, add /time.csv to the report URL.
@ritchie46 Using 0.7.11, the groupby runs at the 1e9 data size are being killed by OOM during data load.
We renamed py-polars to polars. This PR points to the new pypi registry. The old one won't be updated.