Open ovresko opened 3 years ago
Hi @ovresko, for information, I have started an alternative lib, oups, that has some similarities with pystore. Please be aware this is my first project, but I would gladly accept any feedback on it.
@ranaroussi, I am aware this post may not be welcome and I am sorry if it is a bit rude. Please remove it if it is.
Based on this code, each append loads all the existing data into memory to check for duplicates, then rewrites the whole Parquet dataset. Doing that from multiple threads for items that already hold 100k records, the task consumes 100% of memory for every single-record append.
Why not use fastparquet's write method to append the data, via its append option (True / False / 'overwrite')? https://fastparquet.readthedocs.io/en/latest/api.html#fastparquet.write