Closed: wujinghe closed this 2 years ago
I personally don't use append that often, so there might be something I overlooked. I will take a look when I get a chance.
@wujinghe ,
The reserve() call was indeed unnecessary in append_column. I removed it in master. FYI, the append operation, especially if done very frequently, is inefficient by its nature. But I also understand that sometimes your data pattern leaves you no other choice.
In my scenario, my program receives messages in real time and inserts them into the DataFrame, so I need to call append() for each message. Or do you think this scenario is not a suitable use of DataFrame?
I guess you usually call load() in your program.
It is definitely suitable. But in any system some operations are less efficient than others. With reserve() removed, append should become a lot more efficient. Before, it was moving the data on each call.
Ok, thanks a lot for your reply.
If I append more than 100,000 rows, the program runs slowly. After profiling, the problem is in the append() method, which calls reserve() every time. When I comment out reserve(), it looks fine.
So I want to check whether there are other considerations. Was reserve() perhaps called there to limit how much extra memory gets allocated?