cs2dsb / lcsc-scrape.rs

Scraper that creates a local SQLite database from parts in the JLCPCB SMT assembly service by looking up parameters from the LCSC product page
Apache License 2.0

The last query command does not work #1

Open JLCPCB opened 3 years ago

JLCPCB commented 3 years ago

Hi,

Very cool project! I tested it and found the last query command should be:

./lcsc-scrape query \
'SELECT part_number, Description, JLC_Description, Datasheet, JLC_Datasheet
FROM parts 
WHERE part_number = "C14867"'

-- Atommann Technical Writer from JLCPCB

KrischnaGabriel commented 3 years ago

I'm having trouble starting the program in the first place. Seriously, why doesn't LCSC offer that option on their webpage? Every other big component distributor, like Mouser or Digi-Key, has it.

cs2dsb commented 3 years ago

> Very cool project! I tested it and found the last query command should be:

Thanks, I've updated the README.md

cs2dsb commented 3 years ago

> I'm having trouble starting the program in the first place.

Anything I can help with? I've just added a pre-compiled Windows binary to the release in case it's Windows you are having problems with.

KrischnaGabriel commented 3 years ago

> Anything I can help with? I've just added a pre-compiled windows binary to the release in case it's windows you are having problems with.

I had trouble starting the Linux version because I'm just getting started with Linux and the only distro I currently have is Tails, but the Windows executable solved that problem. Thanks. Now I'm trying to figure out how to export a certain group of parts (for example, all C0G/NP0 ceramic capacitors) to a .csv file so that I can import it into LibreOffice Calc and do some more advanced price comparison (for example, capacitance per $).

The program is running now; I just don't understand the usage. Can you help?

cs2dsb commented 3 years ago

Yeah, so you can run any SQL using the "query" option, and if you redirect the output to a file it will produce a CSV.

So for example:

lcsc-scrape.exe query --drop-null-columns "SELECT * FROM parts WHERE Category = \"Power Management ICs/DC-DC Converters\"" > dc_converters.csv

This will produce a CSV file containing all the DC-DC converters.

KrischnaGabriel commented 3 years ago

I tried, but I ran into two different problems:

1: The columns in the .csv file get messed up because some part descriptions contain a comma, which is then interpreted as a column separator. Possible solutions would be to use a different character as the column separator, or to simply exclude the columns that contain commas (like the "JLC_Description" column), but I don't know how to do that.

2: I tried simply replacing the category (Power Management ICs/DC-DC Converters) with a different one (Multilayer Ceramic Capacitors MLCC - SMD/SMT), but that produced a completely empty file. Did I mess something up?

cs2dsb commented 3 years ago

For the first issue, yes, I should fix that - I'll add it to the todo list. You can work around it by specifying only the columns you want instead of *. So you can do "SELECT part_number, Manufacturer, Description FROM parts WHERE [etc]". You will have to surround the column names in \" quotes if the column name contains a space. Alternatively, you could install a SQLite GUI tool that handles exporting to CSV correctly - https://sqlitebrowser.org/ for example will let you export any query or the whole table to CSV.
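For anyone curious what a fix for the comma problem could look like, here is a minimal sketch of the usual CSV quoting convention (RFC 4180 style): a field containing a comma, a double quote, or a line break is wrapped in double quotes, and embedded quotes are doubled. The function names `csv_escape` and `csv_row` are made up for illustration; this is not the tool's actual code.

```rust
/// Quote one CSV field if needed (hypothetical helper, not from lcsc-scrape).
/// Fields containing a comma, a double quote, or a line break are wrapped in
/// double quotes, and any embedded double quote is doubled.
fn csv_escape(field: &str) -> String {
    let needs_quoting = field.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r'));
    if needs_quoting {
        format!("\"{}\"", field.replace('"', "\"\""))
    } else {
        field.to_string()
    }
}

/// Join already-separated column values into one CSV line (no trailing newline).
fn csv_row(fields: &[&str]) -> String {
    fields
        .iter()
        .map(|f| csv_escape(f))
        .collect::<Vec<_>>()
        .join(",")
}
```

With this, a row like `csv_row(&["C14867", "Cap, ceramic"])` comes out as `C14867,"Cap, ceramic"`, which LibreOffice Calc should read as two columns when the text delimiter is set to the double quote.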

The second one doesn't work because the category needs to include the top level - in this case "Capacitors". So the full category is "Capacitors/Multilayer Ceramic Capacitors MLCC - SMD/SMT".
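To make the category rule concrete, here is a small sketch that builds the query text from a top-level category and a subcategory. It assumes the `parts` table and `Category` column used in the earlier command; the function name `category_query` is hypothetical. It uses a single-quoted SQL string literal, which also avoids having to escape double quotes on the command line.

```rust
/// Build a query for one full category path (hypothetical helper).
/// The full path is "<top level>/<subcategory>". The subcategory may itself
/// contain a '/', as in "SMD/SMT", so the two parts are passed separately.
fn category_query(top_level: &str, subcategory: &str) -> String {
    let full = format!("{}/{}", top_level, subcategory);
    // Double any single quotes so the value is a valid SQL string literal.
    let escaped = full.replace('\'', "''");
    format!("SELECT * FROM parts WHERE Category = '{}'", escaped)
}
```

Calling `category_query("Capacitors", "Multilayer Ceramic Capacitors MLCC - SMD/SMT")` gives a string that can be passed as the argument to the "query" option.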

KrischnaGabriel commented 3 years ago

Okay. Now it does output a file with the right content, but the file only contains 515 parts, while according to LCSC's webpage there are 18055 parts in the MLCC category. In the README.md you wrote “...the cache is completely usable 99% of the time.” but it seems to be missing almost all of LCSC's component library. What's going on here?

cs2dsb commented 3 years ago

The tool only fetches the LCSC info for the parts available in the JLCPCB SMT service. It starts from the JLCPCB spreadsheet to get the LCSC part number and fetches the info for that part from lcsc.com.

To collect all the LCSC info, it would have to brute-force check every part number, starting at 1 and going into the millions. I suspect this would take weeks to run, as each request to LCSC takes around 5 seconds.
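As a rough back-of-the-envelope check of that estimate, the sketch below computes how long a strictly sequential scrape would take. The 5 seconds per request comes from the comment above; the part count of 1,000,000 used in the note below is only an assumed stand-in for "into the millions", and `scrape_days` is a made-up name.

```rust
/// Estimated wall-clock days for a sequential scrape (illustrative only).
/// Assumes one request per part number and no parallelism or retries.
fn scrape_days(part_count: u64, seconds_per_request: f64) -> f64 {
    const SECONDS_PER_DAY: f64 = 86_400.0;
    part_count as f64 * seconds_per_request / SECONDS_PER_DAY
}
```

For an assumed 1,000,000 part numbers at 5 seconds each, `scrape_days(1_000_000, 5.0)` is about 57.9 days, so roughly eight weeks, and every further million part numbers adds the same again.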

The tool was designed to make it easy to pick suitable parts for the JLC service so indexing everything on LCSC wasn't a goal. In theory it could be modified to do that but I'd probably want LCSC to give their blessing for the excessive scraping that would be required before going ahead with it...

JLCPCB commented 3 years ago

I wonder if adding a GUI shell for this tool would make it more intuitive. I searched for "rust gui", and it seems that Rust is not GUI yet.