Description of problem and/or code sample that reproduces the issue
Hi, I use mongodump and mongorestore to move libraries between PCs (let me know if there is an easier way). Each library (mine is called "attribution_europe_data") consists of 5 collections from MongoDB's point of view: attribution_europe_data / ....ARCTIC / ....snapshots / ...version_nums / ...versions. During mongodump, two files are dumped per collection, so 10 files in total per library.
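For reference, the dump/restore described above looks roughly like this (a sketch: the hosts, output path, and the default "arctic" database name are assumptions on my part):

```shell
# On the old PC: dump the database holding the Arctic libraries.
# mongodump writes two files per collection (<name>.bson and
# <name>.metadata.json), hence 10 files for the 5 collections above.
mongodump --host localhost --db arctic --out /tmp/arctic_dump

# Copy /tmp/arctic_dump to the new PC, then restore it there:
mongorestore --host localhost --db arctic /tmp/arctic_dump/arctic
```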
I can successfully mongorestore those 10 files onto a separate PC, i.e. I can do things like print(Arctic('localhost')['attribution_data_europe'].list_symbols())
Now, each symbol in my library represents a pandas DataFrame (actually saved as a Blob, since they contain Objects), around 5000 rows x 2000 columns. The issue is that if I read one on the new PC, e.g. "Arctic('localhost')['attribution_europe_data'].read('20220913').data" in Spyder, it freezes and eventually shows "Restarting kernel...."
It shouldn't be a memory issue when reading that DataFrame: I generated a randomly filled DataFrame of similar size on the same PC and reading it is fine.
As a test, I used the same mongodump and mongorestore method on a smaller, simpler library whose only symbol is the dictionary {'hi': 1}. The new PC (where I restored it) is able to read this library and this symbol without any issue. Similarly, the same method works on a pure DataFrame (as opposed to a Blob)!
So do you think the mongodump and mongorestore process corrupts Blob objects?
Also, what do you normally use to transfer Arctic libraries from one PC to another? Surely there is a simpler way than mongodump and mongorestore?
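One alternative I have been considering, instead of dumping raw MongoDB collections, is copying at the symbol level with Arctic's own read/write API. A minimal sketch (the hostnames are hypothetical; list_symbols/read/write are the VersionStore methods):

```python
# Sketch: copy every symbol from one Arctic library to another by
# reading and re-writing it, rather than dumping raw collections.
# Works with any library object exposing list_symbols()/read()/write().

def copy_symbols(src_lib, dst_lib):
    """Copy each symbol from src_lib to dst_lib, re-serialising on write."""
    copied = []
    for symbol in src_lib.list_symbols():
        item = src_lib.read(symbol)       # versioned item; .data is the payload
        dst_lib.write(symbol, item.data)  # re-writes (and re-pickles any blob)
        copied.append(symbol)
    return copied

# Usage (hypothetical hosts, assuming both MongoDB instances are reachable):
#   from arctic import Arctic
#   src = Arctic('old-pc-hostname')['attribution_europe_data']
#   dst = Arctic('localhost')['attribution_europe_data']
#   copy_symbols(src, dst)
```

Because each symbol is re-serialised on write, this sidesteps any byte-level issues a raw collection dump might carry over.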
==============
Just to update with more investigation:
1) if the symbol is a DataFrame (NOT saved as a blob), it works
2) if the symbol is a dict, say {'hi': 1}, it works
3) if the symbol is a blob, it DOES NOT work (i.e. the new PC has trouble reading that symbol from the restored library)
4) if the symbol is a dict wrapped around a pure DataFrame, e.g. {'hi': pd.DataFrame(np.random.rand(2, 2))}, it works
5) if the symbol is a dict wrapped around a blob, e.g. {'hi': some_blob}, it DOES NOT work.
I have included below what the symbol looks like on the old PC, and what error the new PC throws, for the case where the symbol is a dict wrapped around a blob.
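One thing that may explain why only the blob cases fail: as far as I understand, arctic falls back to pickling objects it cannot store natively (that is what a "blob" is here), so reading a restored blob depends on the new PC being able to unpickle it with its own Python/pandas/numpy versions. A minimal model of that round trip, using only the standard library (a sketch of the idea, not arctic's actual code):

```python
# Sketch of blob handling: non-natively-storable objects are pickled on
# write and unpickled on read, so a restored blob is only readable if the
# new PC's environment can unpickle what the old PC's environment produced.
import pickle

payload = {"hi": [1, 2, 3]}  # stand-in for a blob-stored object
blob = pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL)

# This is essentially what reading a blob symbol does on the new PC:
restored = pickle.loads(blob)
print(restored == payload)  # → True
```

If the two PCs differ in Python or pandas/numpy versions, the unpickle step is where I would expect a blob-only failure to originate.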
Arctic Version
Arctic Store
Platform and version
Spyder (Python 3.8)
(old PC)
(new PC)