Closed: zenermerps closed this 5 months ago
Thanks for reporting and also for presenting the fix right away! ^^
For product page
https://www.lcsc.com/product-detail/Monitors-Reset-Circuits_LOWPOWER-LP5300B6F_C387703.html
with "datasheet" link https://datasheet.lcsc.com/lcsc/1912111437_LOWPOWER-LP5300B6F_C387703.pdf
Hm, I do actually seem to be getting a download redirect from that link too, but I guess the download function doesn't pick it up properly.
I'm now replacing "//datasheet.lcsc.com/" with "//wmsc.lcsc.com/wmsc/upload/file/pdf/v2/" in LCSC datasheet URLs. Could you quickly confirm that this works as intended and fixes the issue?
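For anyone following along, the replacement described above can be sketched roughly like this. This is only an illustration, not the tool's actual code: the function name `rewrite_lcsc_datasheet_url` and the constants are made up for the example, and it assumes LCSC keeps serving the files under the `wmsc.lcsc.com` path mentioned in this thread.

```python
# Hypothetical helper mirroring the string replacement described in this thread.
OLD_PREFIX = "//datasheet.lcsc.com/"
NEW_PREFIX = "//wmsc.lcsc.com/wmsc/upload/file/pdf/v2/"


def rewrite_lcsc_datasheet_url(url: str) -> str:
    """Point an LCSC datasheet viewer URL at the underlying PDF file.

    URLs that do not contain the old prefix are returned unchanged.
    """
    # Replace only the first occurrence; the filename part stays as-is.
    return url.replace(OLD_PREFIX, NEW_PREFIX, 1)


if __name__ == "__main__":
    viewer = "https://datasheet.lcsc.com/lcsc/1912111437_LOWPOWER-LP5300B6F_C387703.pdf"
    print(rewrite_lcsc_datasheet_url(viewer))
```

Because only the domain and path prefix change, a plain string replacement is enough here; no HTML parsing of the viewer page should be needed as long as the filename stays the same.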
I tried the example I gave above with current git master and it works now, thanks for the quick fix!
Perfect, thanks for confirming!
LCSC recently switched the datasheet links on the main product page to point to a PDF viewer with an ordering panel at its side. As a result, the datasheet that the tool now downloads is actually an HTML page with a .pdf extension. Please update the crawler to download the embedded PDF from the site instead of just the HTML.
Edit: The filename of the PDF itself seems to stay consistent from the link on the product page to the actual file, so just the domain and path need to be replaced, e.g.
For product page
https://www.lcsc.com/product-detail/Monitors-Reset-Circuits_LOWPOWER-LP5300B6F_C387703.html
with "datasheet" link https://datasheet.lcsc.com/lcsc/1912111437_LOWPOWER-LP5300B6F_C387703.pdf
the actual PDF can be found at https://wmsc.lcsc.com/wmsc/upload/file/pdf/v2/lcsc/1912111437_LOWPOWER-LP5300B6F_C387703.pdf
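As a side note, the symptom reported here (HTML saved with a .pdf extension) could be caught with a quick check of the downloaded bytes. The sketch below is only an assumption about how such a check might look; `looks_like_pdf` is a hypothetical name and not part of the tool. It relies on the fact that PDF files begin with the marker `%PDF-`.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Return True if the downloaded bytes start with the PDF header marker.

    Leading ASCII whitespace is ignored; an HTML page served under a
    .pdf name would typically start with "<" instead.
    """
    return data.lstrip().startswith(b"%PDF-")


if __name__ == "__main__":
    print(looks_like_pdf(b"%PDF-1.4\n..."))            # a real PDF header
    print(looks_like_pdf(b"<!DOCTYPE html><html>"))    # the HTML viewer page
```

A check like this would not fix the download by itself, but it could let the tool warn when a datasheet link returns something other than a PDF.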