Open Wh1isper opened 3 days ago
Improving performance for data processing and model training by changing code style as a Pythonista.
I saw many operations on DataFrames or Series implemented with 'for' loops. Unlike in C, this can cause performance problems in Python: because of the interpreter, a 'for' loop runs much more slowly than the equivalent loop in C. pandas, however, is largely implemented in C, so when we use pandas methods such as 'apply' instead, more of the work on the DataFrame runs in C, which is faster. The picture shows the result.
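To make the comparison concrete, here is a minimal sketch of the same computation written three ways. The data and the function names (`scale_with_loop`, `scale_with_apply`, `scale_vectorized`) are made up for illustration and do not come from this project. Note that 'apply' with a Python function still calls that function once per element, so a fully vectorized expression is usually the fastest of the three when one is available.

```python
import numpy as np
import pandas as pd

# Hypothetical data, only for illustration.
s = pd.Series(np.arange(10_000, dtype="float64"))


def scale_with_loop(series: pd.Series) -> pd.Series:
    # Explicit Python loop: every iteration goes through the interpreter.
    out = []
    for value in series:
        out.append(value * 2.0 + 1.0)
    return pd.Series(out, index=series.index)


def scale_with_apply(series: pd.Series) -> pd.Series:
    # pandas drives the iteration, but the lambda is still called per element.
    return series.apply(lambda v: v * 2.0 + 1.0)


def scale_vectorized(series: pd.Series) -> pd.Series:
    # Whole-column arithmetic: the loop runs in compiled code.
    return series * 2.0 + 1.0


loop_result = scale_with_loop(s)
apply_result = scale_with_apply(s)
vector_result = scale_vectorized(s)

# All three styles must produce the same values before comparing speed.
assert loop_result.equals(apply_result)
assert apply_result.equals(vector_result)
```

Checking that the results are identical first makes it safe to swap one style for another in a pull request; the actual speedup depends on the data size and the operation, so it is worth measuring on the real workload.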
Related work: #244
Problem
Thanks @cyantangerine in #244:
Proposed Solution
'for' is usually bad for performance when handling pandas.DataFrame, but unfortunately we often don't take that into account. If anyone is interested in helping us improve performance, please feel free to draft a pull request!
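For anyone who wants to pick this up, a rough way to check a candidate rewrite is sketched below. The column names (`price`, `quantity`) and the helper functions are hypothetical and only stand in for whatever loop is being replaced; the timings depend on the machine and the data, so no numbers are claimed here.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical DataFrame, only for illustration.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "price": rng.random(5_000),
        "quantity": rng.integers(1, 10, 5_000),
    }
)


def total_with_iterrows(frame: pd.DataFrame) -> pd.Series:
    # Row-by-row 'for' loop: builds a Series object for every row.
    totals = []
    for _, row in frame.iterrows():
        totals.append(row["price"] * row["quantity"])
    return pd.Series(totals, index=frame.index)


def total_vectorized(frame: pd.DataFrame) -> pd.Series:
    # Column arithmetic: one operation over the whole column.
    return frame["price"] * frame["quantity"]


# 1. Confirm the rewrite gives the same answer.
assert np.allclose(total_with_iterrows(df), total_vectorized(df))

# 2. Measure both versions in your own environment.
loop_seconds = timeit.timeit(lambda: total_with_iterrows(df), number=3)
vector_seconds = timeit.timeit(lambda: total_vectorized(df), number=3)
print(f"iterrows loop: {loop_seconds:.4f}s, vectorized: {vector_seconds:.4f}s")
```

Including both the equality check and the measured timings in the pull request description makes the change easier to review.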
Additional context