datatalking opened 3 years ago
Good questions. :smile:
For points 1 and 3:
We haven't imposed any limits on # of calls each day, and there isn't any cost (from us) for transfer. From our internal stats, we're not really getting anywhere near usage that would mean we have to introduce limits nor costs. If/when that starts happening, then we'll look into the best way of handling it.
In the meantime, we're upgrading the backend servers anyway to be much more powerful, so that'll push out any potential need for call limits further.
With point 2, it's currently hard coded to 512MB. Uploading a 512MB database can take an awful long time, but the option is there if people want to for some reason. On that note, if someone wants a larger database uploaded, the new servers will be able to handle it (prob up to about 10GB). The current backend servers though would struggle, so we'll leave it as 512MB for now. :wink:
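A hard-coded upload cap like the 512MB one described above usually boils down to a simple byte comparison, possibly with a per-user override. A minimal sketch (the constant and function names here are my own illustration, not DBHub.io's actual code):

```python
# Hypothetical sketch of a hard-coded upload size cap, similar in spirit
# to the 512MB limit described above. Names are illustrative only.
MAX_UPLOAD_BYTES = 512 * 1024 * 1024  # 512MB

def upload_allowed(size_bytes: int, whitelisted: bool = False) -> bool:
    """Reject uploads over the cap, unless the user has been whitelisted."""
    if whitelisted:
        return True
    return size_bytes <= MAX_UPLOAD_BYTES
```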
For point 4: to become a contributor, you're probably best off starting to get the hang of documenting stuff using the wiki (it's public access, even for writing). We haven't really put enough time and effort into documenting stuff, so it's likely a case of making an initial start, learning from that, and iterating on it.
Does that make sense? :smile:
That aside, where does your general interest in this stuff lie, and what skills are you already strong on? Personally, I've found that documenting stuff works pretty well if it ties into an area of interest. Whereas trying to force myself to document stuff that's kind of boring... keeps on getting put off. :wink:
Most of the areas I know are DevOps or software, or the algorithmic/statistical side of a process.
I spent 12 years in mechanical engineering making working drawings of everything from concrete formwork for pouring foundations and bridges, to precision machining, to jigs for military aircraft parts.
After 9/11 I switched to finance, since I've been told I over-analyze everything, and I loved all of the data gathering. I spent 7+ years doing Monte Carlo multi-chain and hidden Markov analysis for risk tolerance and growth projections. I've been slowly building an algorithmic trading analytical tool and run a stealth startup.
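For readers unfamiliar with the technique mentioned above, a Monte Carlo growth projection can be sketched in a few lines: simulate many possible multi-year return paths and look at the distribution of outcomes. This is a toy version with made-up parameters, not the actual tooling described:

```python
import random
import statistics

def project_growth(start: float, years: int, mean_return: float,
                   volatility: float, runs: int, seed: int = 42) -> list[float]:
    """Toy Monte Carlo projection: simulate `runs` portfolio paths using
    normally distributed annual returns, returning the end values."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        value = start
        for _ in range(years):
            value *= 1 + rng.gauss(mean_return, volatility)
        results.append(value)
    return results

# Example: $10k over 10 years at 7% mean return, 15% volatility.
outcomes = project_growth(10_000, years=10, mean_return=0.07,
                          volatility=0.15, runs=1_000)
median_outcome = statistics.median(outcomes)
```

The median here lands near the deterministic 10,000 × 1.07¹⁰ figure, while the spread of `outcomes` shows the risk-tolerance side of the analysis.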
Somewhere in the documentation/automation arena. I'm currently wrapping up an undergrad in data analytics, and I use a ton of Python, pandas, and SQL for ingesting, cleaning, sorting, and organizing files.
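The ingest/clean/load workflow described above is often only a few lines of code. Here's a minimal stdlib-only sketch (using `sqlite3` directly instead of pandas so it's self-contained; the CSV feed and column names are invented for illustration):

```python
import csv
import io
import sqlite3

# Hypothetical CSV feed; column names are illustrative only.
raw = io.StringIO("fund,date,nav\nABC,2024-01-02,101.5\nABC,2024-01-03,\n")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (fund TEXT, date TEXT, nav REAL)")

rows = []
for rec in csv.DictReader(raw):
    if not rec["nav"]:  # cleaning step: drop rows with missing prices
        continue
    rows.append((rec["fund"], rec["date"], float(rec["nav"])))

conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)
count = conn.execute("SELECT COUNT(*) FROM prices").fetchone()[0]
```

With pandas the same thing is roughly `read_csv` → `dropna` → `to_sql`, but the shape of the pipeline is identical.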
I'm good at zooming out to see the macro issues and then zooming in to follow each micro process from start to finish over and over.
Curious to know if the database limit is still 512MB or was it bumped to something higher?
@captn3m0 It's still 512MB by default, but it can be switched off for named users (eg admin staff, so far).
What did you have in mind?
Btw - check your email, if you haven't recently. Emailed you a few times last night about stuff (eg the SQLite zipfile module), but not sure if you're getting them. :smile:
> What did you have in mind?
I maintain a dataset of Indian Mutual Funds with historical pricing information that goes back to 2006. It's ~250MB compressed (zstd), but inflates to ~935MB, and grows to 2GB after index addition. It grows by roughly 50-100MB a year.
Would be nice to have it on DBHub, for tracking changes over time more easily.
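As an aside, the "grows to 2GB after index addition" effect is easy to demonstrate: SQLite stores index b-trees inside the same database file, so creating an index directly increases the file size. A small illustration with synthetic data (not the actual dataset):

```python
import os
import sqlite3
import tempfile

# Throwaway on-disk database with some synthetic pricing rows.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE nav (fund_id INTEGER, date TEXT, price REAL)")
conn.executemany("INSERT INTO nav VALUES (?, ?, ?)",
                 [(i % 100, f"2024-01-{i % 28 + 1:02d}", i * 0.5)
                  for i in range(50_000)])
conn.commit()
size_before = os.path.getsize(path)

# Adding an index stores extra b-tree pages in the same file.
conn.execute("CREATE INDEX idx_nav_fund_date ON nav (fund_id, date)")
conn.commit()
conn.close()
size_after = os.path.getsize(path)
```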
Thanks for your mails, replied there already!
No worries, that sounds workable. How often does it get updated?
Once a day.
Ahhh. At the moment our backend still stores every snapshot as a complete, independent SQLite database file. So that's really more like 2GB * 365 days (per year), until we get around to changing the backend storage to only do differences in some way.
If you do it as a "Live" database however (no historical snapshots though), then the on-disk size would only be that latest size. That'd be way more workable for us in the short/medium term.
Thoughts?
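The storage trade-off above is straightforward arithmetic, but worth spelling out. Using the ~2GB figure mentioned:

```python
# Back-of-envelope storage comparison (figures in GB, from the thread).
db_size = 2               # one full snapshot, with indexes (~2GB)
snapshots_per_year = 365  # one upload per day

# Current backend: every daily snapshot stored as a complete SQLite file.
snapshot_storage = db_size * snapshots_per_year  # GB per year

# "Live" database: only the latest copy kept on disk.
live_storage = db_size  # GB, regardless of update frequency
```

That's 730GB/year versus a flat 2GB, which is why the live-database route is the workable one until the backend stores diffs.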
I guess on the plus side of DBHub.io vs the flatgithub.com approach is performance. We're using decently specced servers and reasonable (heh) Go code, so working with the data via the web interface is fairly workable.
On the negative side though, we haven't (yet) hooked up the column filtering part of our data grid layout, so it's not yet possible to type in a search term to filter stuff.
That's likely not a big task in itself, and shouldn't be too far off. Still probably a few weeks away, unless @MKleusberg wants to prioritize it sooner (?). :wink:
Live database would work. The boring changes (pricing data) are versioned inside the database, so they can be tracked with queries. The other changes I want to track (Metadata changes, such as names, or IDs) - I can track elsewhere for now.
All good. I'll whitelist you on DBHub.io now.
Did you want the `captn3m0` username, or the `nemo` one (if that gets re-created properly), or both, etc? :smile:
Both would be nice.
No worries, will do. :smile:
k, I've added your `captn3m0` username to the whitelist. The `nemo` user can be done once the user is in our system. eg try "signing up" with `nemo` again, and see if Auth0 likes it this time.
I've just learned about your project, and perhaps this is discussed elsewhere, but after reviewing the docs I didn't see where the limits of the API are documented.
Questions: