Closed SysuJayce closed 1 year ago
Hi @SysuJayce, I appreciate the bug report. Sadly, if I remember correctly, this is not the first time this specific issue has come up. Unfortunately, it is very hard to solve without code that consistently reproduces the problem, so I am not sure what I can do here.
Presumably this should take less than 20 minutes, yes? Could you try running the task with progress bars disabled?
Hi @till-m, I found that disabling progress_bar is a workaround. Thanks.
Hi @SysuJayce, could you check the following?
Closing for now, feel free to update with more information.
General
Operating System: CentOS 7
Python version: 3.8
Pandas version: 2.0.1
Pandarallel version: 1.6.4
Acknowledgement
[x] My issue is NOT present when using pandas alone (without pandarallel)
[ ] If I am on Windows, I read the Troubleshooting page before writing a new bug report
Bug description
I started a parallel_apply job with 80 workers to decode and clean a large amount of data (about 50GB). After nearly 8 minutes, most of the workers had reached 100%, but some got stuck. After 20 minutes, the progress bar was still frozen.
Observed behavior
The progress bar freezes and CPU usage is 0%.
Expected behavior
Progress should be nearly linear; according to the progress bar, the program should finish after around 10 minutes.
Minimal but working code sample to ease bug fix for pandarallel team
Write here the minimal code sample to ease bug fix for pandarallel team