Bouni / kicad-jlcpcb-tools

Plugin to generate BOM + CPL files for JLCPCB, assign LCSC part numbers directly from the plugin, query the JLCPCB parts database, look up datasheets and much more.
MIT License
1.08k stars, 102 forks

Idea: Hosted Parts DB #450

Open whmountains opened 2 months ago

whmountains commented 2 months ago

Crazy idea, but I think it has some promise. Instead of distributing a downloadable SQLite database, why not host the query API itself?

This has a number of benefits:

I would be happy to provide the hosting infrastructure and/or contribute an implementation.
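As a rough illustration of what "hosting the query API" could look like, here is a minimal Python sketch using only the standard library. Everything specific in it is an assumption made for illustration: the `/search` endpoint, the `q` parameter, the `parts(lcsc, description)` table layout, and the `make_handler` and `search_parts` names are hypothetical and not part of the plugin or its database.

```python
import json
import sqlite3
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, quote, urlparse
from urllib.request import urlopen


def make_handler(db_path):
    """Build a request handler that answers GET /search?q=... from db_path.

    Assumes a hypothetical table `parts(lcsc TEXT, description TEXT)`.
    """

    class SearchHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            url = urlparse(self.path)
            if url.path != "/search":
                self.send_error(404)
                return
            term = parse_qs(url.query).get("q", [""])[0]
            # One connection per request keeps the handler thread-safe.
            conn = sqlite3.connect(db_path)
            try:
                rows = conn.execute(
                    "SELECT lcsc, description FROM parts "
                    "WHERE description LIKE ? LIMIT 50",
                    (f"%{term}%",),
                ).fetchall()
            finally:
                conn.close()
            body = json.dumps(
                [{"lcsc": lcsc, "description": desc} for lcsc, desc in rows]
            ).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            # Silence the default per-request logging to stderr.
            pass

    return SearchHandler


def search_parts(base_url, term, timeout=5.0):
    """Client side: what the plugin could call instead of querying a local file."""
    with urlopen(f"{base_url}/search?q={quote(term)}", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A server would be started with something like `ThreadingHTTPServer(("0.0.0.0", 8080), make_handler("parts.db")).serve_forever()`, where the port and file name are again placeholders. The plugin would then only transfer the few rows matching each query rather than the whole database.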

chmorgan commented 2 months ago

That would be a huge saving in terms of what people need to download.

If we could count on Docker, it would be possible to start the container automatically for people who want a local database, using the same container that runs on the hosted service. But I don't think we can count on that infrastructure locally.

whmountains commented 2 months ago

I like the idea of a self-hosted option! But yeah, the number of KiCad users who also have Docker available is probably quite low.

Depending on how we decide to implement this search server, it could be a single binary users could download and choose to run. Or the KiCad plugin could even download it on their behalf.

I saw you mention writing a Rust tool to benchmark the queries. I'm also a big Rust user, and I noticed that the Rust ecosystem is surprisingly strong when it comes to full-text search, from basic libraries like aho-corasick, regex, and fst to full-blown search engines like Sonic, Meilisearch, and Tantivy. I can't help thinking there must be a simple way to make a "search server" that keeps an in-memory index of all the parts and responds instantly to queries.

By simple I mean the "back to basics" kind of simple. If we can come up with an in-memory data structure that aligns with this workload, very good query performance should be possible. For example, what if we pre-built our full-text index using finite-state automata and matched against it using regexes? Apparently that is fast, and yet the index is smaller than the original text. But testing is needed to be sure. This is just the type of thing I'm thinking about.
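To make the "in-memory index" idea a little more concrete, here is a small Python sketch. It is not the finite-state-automaton approach mentioned above; it swaps in a much simpler technique, an inverted index with prefix lookup by binary search over a sorted token list. The `PrefixIndex` class and the sample part ids and descriptions are made up for illustration.

```python
import bisect
from collections import defaultdict


class PrefixIndex:
    """In-memory inverted index with prefix lookup over a sorted token list."""

    def __init__(self):
        self._postings = defaultdict(set)  # token -> set of part ids
        self._tokens = []  # sorted tokens, rebuilt by finalize()

    def add(self, part_id, text):
        """Index every whitespace-separated, lowercased token of text."""
        for token in text.lower().split():
            self._postings[token].add(part_id)

    def finalize(self):
        """Sort the tokens once so search() can use binary search."""
        self._tokens = sorted(self._postings)

    def search(self, query):
        """Return ids of parts where every query word is a prefix of some token."""
        result = None
        for word in query.lower().split():
            # First token >= word; all tokens sharing the prefix follow it.
            i = bisect.bisect_left(self._tokens, word)
            matches = set()
            while i < len(self._tokens) and self._tokens[i].startswith(word):
                matches |= self._postings[self._tokens[i]]
                i += 1
            result = matches if result is None else result & matches
        return result or set()


# Made-up sample data, not real LCSC part numbers.
index = PrefixIndex()
index.add("P1", "100nF 50V X7R 0603 capacitor")
index.add("P2", "100nF 16V X5R 0402 capacitor")
index.add("P3", "10k 1% 0603 resistor")
index.finalize()

print(sorted(index.search("100n 0603")))  # ['P1']
print(sorted(index.search("cap")))        # ['P1', 'P2']
```

Whether a structure like this, an FST, or SQLite's own full-text search is the best fit would still need the benchmarking mentioned above.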

chmorgan commented 2 months ago

A self-hosting option addresses people who don't want their searches going to the cloud, people who work offline (likely very rare at the moment), and the general question of what happens if whoever is hosting decides to stop.

The cloud approach does mean that, assuming someone is paying for hosting with enough RAM, the whole database can be kept in memory for instant searching, as you say.
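For the existing SQLite database, one way a hosted service could hold everything in RAM is to copy the file into an in-memory connection at startup. The sketch below uses `sqlite3.Connection.backup` from the Python standard library; the `load_into_memory` helper name is hypothetical, and nothing here says how much RAM the real parts database would need.

```python
import sqlite3


def load_into_memory(path):
    """Copy an on-disk SQLite database into a new in-memory connection."""
    disk = sqlite3.connect(path)
    try:
        memory = sqlite3.connect(":memory:")
        # backup() copies the whole database page by page into the target.
        disk.backup(memory)
    finally:
        disk.close()
    return memory
```

Queries against the returned connection never touch the disk, at the cost of keeping the full database resident in the server's memory.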

At the moment there isn't a good path to run Rust locally across the various OSs and architectures supported by KiCad. Ideally you'd target wasm, but wasm support for Rust I/O is weak at the moment, so any database crates you used would likely not build for a wasm target.

As soon as there are separate codebases for cloud and local, IMO one of them will atrophy and eventually be removed.

If the cloud infrastructure were going to use containers, IMO that would at least be a smoothish path for users who want to run locally. They'd install Docker and the plugin could start a container with the appropriate image.
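A sketch of how a plugin might start such a container, assuming Docker is installed and on the `PATH`. The image name, the port numbers, and both function names are hypothetical; only the `docker run` flags (`--rm`, `--detach`, `--publish`) are real CLI options.

```python
import shutil
import subprocess


def docker_run_command(image, host_port, container_port=8080):
    """Build the argv for starting the search container in the background."""
    return [
        "docker", "run",
        "--rm",      # remove the container when it stops
        "--detach",  # run in the background and print the container id
        # Bind only to localhost so the service is not exposed on the network.
        "--publish", f"127.0.0.1:{host_port}:{container_port}",
        image,
    ]


def start_local_server(image, host_port):
    """Start the container; return its id, or None if Docker is unavailable."""
    if shutil.which("docker") is None:
        return None
    result = subprocess.run(
        docker_run_command(image, host_port),
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if docker run fails
    )
    return result.stdout.strip()
```

Returning `None` when Docker is missing would let the plugin fall back to the hosted service, which matches the concern that most users won't have Docker installed.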

whmountains commented 2 months ago

Hmmm, I thought Rust binaries could be built for Windows using AppVeyor, but I've never done it, and it would be a pain to set up.

Related: https://users.rust-lang.org/t/can-i-cross-compile-my-rust-binary-for-windows-on-my-linux-system/15731

Bouni commented 2 months ago

I thought about the hosted approach in the past but never had time to implement it. In my opinion it is a better approach than having every user download a huge database every time. Users without an internet connection are an edge case in my opinion (thoughts on that?).

If anybody has the time to implement this, I would consider merging it, as I'm not really happy with the SQLite approach hosted in chunks on GitHub ...

I could provide hosting as well, I'm just not sure how much RAM we're talking about. I have a bunch of Hetzner cloud instances with, I think, 8 GB that could serve this, preferably as a Docker container.