30350n / inventree_part_import

CLI to import parts from suppliers like DigiKey, LCSC, Mouser, etc. to InvenTree
MIT License
A couple of suggestions #10

Closed jonmol closed 3 months ago

jonmol commented 4 months ago

Hi

First, big thanks for this! I've got chaotic parts spread all over and this really helps me actually start using inventree!

While using the script I've bumped into a few things and just want to share my thoughts. No problem if you don't want to add any of it. Also, as you'll notice, Python isn't my language, which is why I didn't make a PR:

First, in some cases I have the actual supplier product ID, for instance because it's printed on the bag with the part. The search then fails to find the part. What I hacked in was this part in supplier_digikey.py:

    def search(self, search_term):
        for retry in retry_timeouts():
            with retry:
                results = digikey.product_details(search_term,
                    x_digikey_locale_currency=self.currency,
                    x_digikey_locale_site=self.location,
                    x_digikey_locale_language=self.language,
                )

        if results is not None:
            product_count = 1
            filtered_results = [ results ]
            exact_matches = [ results ]
        else:
            for retry in retry_timeouts():
                with retry:
                    results = digikey.keyword_search(
                        body=KeywordSearchRequest(keywords=search_term, record_count=10),
                        x_digikey_locale_currency=self.currency,
                        x_digikey_locale_site=self.location,
                        x_digikey_locale_language=self.language,
                    )

            if results.exact_manufacturer_products_count > 0:
                filtered_results = results.exact_manufacturer_products
                product_count = results.exact_manufacturer_products_count
            else:
                filtered_results = [
                    digikey_part for digikey_part in results.products
                    if digikey_part.manufacturer_part_number.lower().startswith(search_term.lower())
                ]
                product_count = results.products_count

            exact_matches = [
                digikey_part for digikey_part in filtered_results
                if digikey_part.manufacturer_part_number.lower() == search_term.lower()
            ]

This isn't really optimal. Ideally, I think it should first check with all suppliers whether the search term is an actual supplier product ID, then search the other suppliers with the exact manufacturer part number to get more chances at images, datasheets and prices.
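A minimal sketch of that two-pass idea, not the tool's actual code: `Part`, `FakeSupplier`, `find_by_sku`, `find_by_mpn` and `two_pass_search` are all hypothetical names, and the part data is made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class Part:
    supplier: str
    sku: str  # supplier product ID
    mpn: str  # manufacturer part number


@dataclass
class FakeSupplier:
    """Hypothetical in-memory supplier, standing in for a real API client."""
    name: str
    parts: List[Part] = field(default_factory=list)

    def find_by_sku(self, sku: str) -> Optional[Part]:
        for part in self.parts:
            if part.sku.lower() == sku.lower():
                return part
        return None

    def find_by_mpn(self, mpn: str) -> List[Part]:
        return [part for part in self.parts if part.mpn.lower() == mpn.lower()]


def two_pass_search(suppliers: Sequence[FakeSupplier], term: str) -> List[Part]:
    # Pass 1: check whether the term is an exact supplier product ID anywhere.
    for supplier in suppliers:
        hit = supplier.find_by_sku(term)
        if hit is not None:
            results = [hit]
            # Pass 2: query the remaining suppliers with the exact MPN.
            for other in suppliers:
                if other is not supplier:
                    results.extend(other.find_by_mpn(hit.mpn))
            return results
    # No product ID matched: treat the term as an MPN for every supplier.
    results = []
    for supplier in suppliers:
        results.extend(supplier.find_by_mpn(term))
    return results


# Made-up example data.
first = FakeSupplier("SupplierA", [Part("SupplierA", "A-123", "ABC100")])
second = FakeSupplier("SupplierB", [Part("SupplierB", "B-999", "ABC100")])
found = two_pass_search([first, second], "A-123")
print([part.supplier for part in found])
```

Searching by the first supplier's product ID returns that supplier's part plus the matching part from the second supplier, which is where the extra images, datasheets and prices would come from.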

Another thing I noticed is that Digikey is somehow blocking media requests, in my case (with the above tweak) I ran with 4878-2N7000TR-ND as the parameter. The media URL returned was https://mm.digikey.com/Volume0/opasdata/d220001/medias/images/3602/MFG_BC327-16.jpg which opens fine in browsers, but fails with non-browsers. I also tried doing the exact same request as my browser, but with curl, and it failed as well.

In any case, that made me stumble upon this bug:

    @cache
    def _download_file_content(url):
        session = requests.Session()
        session.mount("https://", TLSv1_2HTTPAdapter())
        print(f"Downloading file:{url}\n")
        try:
            for retry in retry_timeouts():
                with retry:
                    result = session.get(url, headers=DOWNLOAD_HEADERS)
                    result.raise_for_status()
        except (HTTPError, Timeout) as e:
            warning(f"failed to download file with '{e}'")
            return None

        return result.content, result.url

When the server answers with 403 Forbidden, _download_file_content returns None, which makes upload_image crash because it expects two values back:

    image_content, redirected_url = _download_file_content(image_url)

I simply changed the except to return None, None and that seems to fix it. Didn't figure out a way to actually get the image sadly.
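A minimal, network-free sketch of why the unpacking fails and how returning a pair avoids it. The functions `download_broken`, `download_fixed` and `upload_image` here are simplified, hypothetical stand-ins, not the project's real implementations.

```python
from typing import Optional, Tuple


def download_broken(url: str):
    """Stand-in for the failing path: returns a bare None on error."""
    return None


def download_fixed(url: str) -> Tuple[Optional[bytes], Optional[str]]:
    """Stand-in for the fixed failing path: always returns a 2-tuple."""
    return None, None


def upload_image(url: str, download) -> bool:
    # The caller always unpacks two values, so the callee must return a pair.
    image_content, redirected_url = download(url)
    if image_content is None:
        return False  # download failed, skip the upload
    return True


try:
    upload_image("https://example.com/image.jpg", download_broken)
    crashed = False
except TypeError:
    # Unpacking a bare None raises TypeError.
    crashed = True

print(crashed)
print(upload_image("https://example.com/image.jpg", download_fixed))
```

With the bare `None`, the unpacking raises a `TypeError`; with `None, None`, the caller can check `image_content` and skip the upload gracefully.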

For Reichelt, it sucks they don't have an API, maybe only use them as a fallback if there are no matches in the others? I have a fair share of parts from them so I like to have them activated (took a while to figure out how to actually activate them btw), but if the part is found elsewhere the data quality is much better there.

Hope I don't come across as pushy. I really appreciate what you've done!

30350n commented 4 months ago

Hey, thanks for using this tool!

Regarding your first code sample, that seems to be based on an outdated version of inventree_part_import. I released 1.5 a couple days ago, which has a ton of fixes and improvements.

Generally, if you use a DigiKey SKU, it should only be matched by DigiKey, unless it somehow matches another MPN (which is relatively unlikely).

Another thing I noticed is that Digikey is somehow blocking media requests, in my case (with the above tweak) I ran with 4878-2N7000TR-ND as the parameter. The media URL returned was https://mm.digikey.com/Volume0/opasdata/d220001/medias/images/3602/MFG_BC327-16.jpg which opens fine in browsers, but fails with non-browsers. I also tried doing the exact same request as my browser, but with curl, and it failed as well.

That's unfortunate, could be that they just recently improved their anti-crawling measures. (The whole TLSv1_2HTTPAdapter + headers thing is in place to circumvent this, but like I said, maybe they changed things.) The return None, None thing is a bug, thanks for spotting :)

For Reichelt, it sucks they don't have an API, maybe only use them as a fallback if there are no matches in the others? I have a fair share of parts from them so I like to have them activated (took a while to figure out how to actually activate them btw), but if the part is found elsewhere the data quality is much better there.

If you configure Reichelt as the last supplier in suppliers.yaml that's exactly what will happen. (The suppliers are being used in the order defined in there.)
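As a rough sketch of that ordering behavior (an illustration only, not the tool's real code): suppliers are tried in their configured order and the first one that returns results wins, so the last entry only acts as a fallback. The function `search_in_order` and the lambda-based suppliers are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

SearchFn = Callable[[str], List[str]]


def search_in_order(
    suppliers: Sequence[Tuple[str, SearchFn]], term: str
) -> Tuple[Optional[str], List[str]]:
    """Try each supplier in the given order; stop at the first with results."""
    for name, search in suppliers:
        results = search(term)
        if results:
            return name, results
    return None, []


# Hypothetical suppliers, listed in the same order as they would be configured.
ordered = [
    ("digikey", lambda term: ["DK result"] if term == "known-part" else []),
    ("reichelt", lambda term: ["Reichelt result"]),  # last entry: fallback only
]

print(search_in_order(ordered, "known-part"))
print(search_in_order(ordered, "other-part"))
```

In this sketch the Reichelt entry is only consulted when every supplier listed before it comes back empty.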

jonmol commented 4 months ago

Regarding your first code sample, that seems to be based on an outdated version of inventree_part_import. I released 1.5 a couple days ago, which has a ton of fixes and improvements.

Generally, if you use a DigiKey SKU, it should only be matched by DigiKey, unless it somehow matches another MPN (which is relatively unlikely).

Oh, OK. I got "part not found" from all suppliers without the changes. I installed using pipx this morning, so I guess the release hasn't been pushed there?

I'll try not being lazy and clone the repo instead to get the latest.

If you configure Reichelt as the last supplier in suppliers.yaml that's exactly what will happen. (The suppliers are being used in the order defined in there.)

Ah, didn't know. Will test that.

30350n commented 4 months ago

I installed using pipx this morning, so I guess the release hasn't been pushed there?

It definitely should be, that seems very odd. Have you installed this before? To update with pipx you have to use pipx upgrade ... or pipx reinstall ..., otherwise it might keep the old version. 1.5 is definitely available on PyPI.

30350n commented 4 months ago

The image downloading code also still seems to work for me btw, maybe you got temporarily blocked or something?

30350n commented 4 months ago

I'll try not being lazy and clone the repo instead to get the latest.

You can also just directly install the newest version from git via pipx btw: pipx install git+https://github.com/30350n/inventree_part_import.git (make sure to uninstall any other versions first).

30350n commented 4 months ago

Generally, if you use a DigiKey SKU, it should only be matched by DigiKey, unless it somehow matches another MPN (which is relatively unlikely).

Just confirmed that this is indeed not working anymore ...

30350n commented 3 months ago

Closing this in favor of #14