Closed dconathan closed 7 years ago
I guess the simplest solution would be to just print/warn the message(s) of DatabaseAPI's get & set when didSucceed is False: https://github.com/nextml/NEXT/blob/master/next/database_client/DatabaseAPI.py#L254-L320
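A rough sketch of what that could look like, as a wrapper rather than a change to DatabaseAPI itself. It assumes `set` returns a `(didSucceed, message)` pair, which should be checked against the linked code. `StubDatabase` and `checked_set` are hypothetical names used only for illustration:

```python
import warnings


class StubDatabase(object):
    """Hypothetical stand-in for DatabaseAPI; rejects values longer than max_len."""

    def __init__(self, max_len=10):
        self.max_len = max_len
        self.store = {}

    def set(self, bucket_id, doc_uid, key, value):
        # Assumed return shape: (didSucceed, message).
        if len(value) > self.max_len:
            return False, "value too big"
        self.store[(bucket_id, doc_uid, key)] = value
        return True, ""


def checked_set(db, bucket_id, doc_uid, key, value):
    """Call db.set and emit a warning instead of failing silently."""
    did_succeed, message = db.set(bucket_id, doc_uid, key, value)
    if not did_succeed:
        warnings.warn(
            "set failed for key %r: %s" % (key, message), RuntimeWarning
        )
    return did_succeed, message
```

With the default warning filters, the message goes to stderr of whatever process runs the backend, so it only helps if the developer can see that output.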
This isn't quite sharding. I don't actually think we should address this! Instead it can be made very clear in the docs that one shouldn't store over a certain amount. How do you intend to return the warning to the user?
Okay I guess I mean partitioning?
Why shouldn't we address this? If I have 1 million pieces of data and I need to regularly update some metadata about them (like the most recent score from a classifier), it's easiest to store this as one long vector, which can easily exceed MongoDB's limits. It's pretty bad when data just disappears. It's really bad when it does so without a warning/error message.
Is the preferable approach to have each of these metadata in a separate doc (so 1 million data would have 1 million docs each containing their most recent score)? Or is the assumption that anyone trying to do work at this scale should implement their own db solution with more capability (e.g. redis)?
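For reference, a middle ground between one huge value and one doc per item would be to split the vector into fixed-size chunks stored under separate keys. This is only a sketch of the idea: a plain dict stands in for the database, and the key naming scheme and function names are made up for illustration, not anything NEXT provides:

```python
def split_into_chunks(values, chunk_size):
    """Split a list into consecutive slices of at most chunk_size items."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def set_chunked(store, key, values, chunk_size):
    """Store values under key:0, key:1, ... plus a chunk count (hypothetical scheme)."""
    chunks = split_into_chunks(values, chunk_size)
    store[key + ":n_chunks"] = len(chunks)
    for i, chunk in enumerate(chunks):
        store["%s:%d" % (key, i)] = chunk


def get_chunked(store, key):
    """Reassemble the full list from its stored chunks."""
    values = []
    for i in range(store[key + ":n_chunks"]):
        values.extend(store["%s:%d" % (key, i)])
    return values
```

Updating a single score then means rewriting only the chunk that contains it, at the cost of extra bookkeeping and non-atomic writes across chunks.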
Depends on what you mean by "user". I'm thinking of a developer working on/debugging a new application, who presumably has access to the STDOUT of the terminal where the next backend was launched. Is this not how it works when launching on AWS?
If you try to set a key/value pair in MongoDB with a very large value (like a long string; I'm guessing this is because of the 16 MB MongoDB limit), the key never gets set, i.e. attempts to retrieve that value will result in a None.
Either we should support this (add some sharding capability to DatabaseAPI?), or this should give a warning "Value too big, key/value not set". Right now there is no warning, which can lead to very hard-to-debug situations...
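To make the warning option concrete, here is a minimal sketch of a size check that could run before a set. MongoDB documents its maximum BSON document size as 16 MB; the sketch uses the JSON-encoded length as a rough proxy for the BSON size, which is an assumption and will not match exactly. `approx_size_bytes` and `check_value_size` are hypothetical helpers, not part of DatabaseAPI:

```python
import json

# MongoDB's documented maximum BSON document size (16 MB).
MAX_DOC_BYTES = 16 * 1024 * 1024


def approx_size_bytes(value):
    """Approximate the stored size via the UTF-8 JSON encoding (a proxy, not BSON)."""
    return len(json.dumps(value).encode("utf-8"))


def check_value_size(key, value, limit=MAX_DOC_BYTES):
    """Raise ValueError if value looks too large to store; otherwise return its size."""
    size = approx_size_bytes(value)
    if size > limit:
        raise ValueError(
            "Value too big, key/value not set: key %r is about %d bytes (limit %d)"
            % (key, size, limit)
        )
    return size
```

Because the proxy is inexact, a real check would probably want some safety margin below the limit, or would compute the actual BSON size.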