iamtheyammer / fetch-ford-service-manuals

Downloads HTML and PDF versions of Ford Service Manuals from PTS
GNU General Public License v3.0

PDFs being generated with broken images. #21

Closed by gcacciola 3 months ago

gcacciola commented 4 months ago

Hello, first of all, congratulations on the awesome tool! Great job; it has made our lives much easier.

I am from Brazil, and I believe I set everything up correctly. I am trying to download the 2024 Ranger Raptor service manual.

The logs look as if everything is being downloaded correctly as PDFs and the file structure is created correctly, but once the run finishes the whole folder is only about 50 MB, which seemed too small to be right. When I opened the PDFs, the formatting and text were fine, but every image was empty, with the broken-image icon inside its frame.

When I also enabled the HTML output, it seemed to work at first, but then I realized the images were not being saved locally. They were linked to where they are hosted on Ford's servers, so I don't believe that would be useful in the long run.

After a bit of googling (I am no programmer), I think I may have figured out what is happening: the PDFs are being generated before the images finish loading on the page. Since I am in Brazil, the latency to Ford's servers makes the images take a bit longer to load than expected; I can clearly notice a delay in image load times when navigating through the manual.

I am sorry if I am totally wrong, but I believe Playwright's waitUntil load setting might need some kind of delay or a network idle option.

Am I getting somewhere with my assumptions?

Thank you for your patience.

strictlysimple commented 4 months ago

What a strange coincidence, @gcacciola. I literally emailed the developer at the exact moment that you posted this comment about the same exact model. LOL

gcacciola commented 4 months ago

@strictlysimple oh, what a coincidence. By model, you mean the Ranger Raptor, right? Are you having the same issue with the images? Where are you located?

strictlysimple commented 4 months ago

The base 2024 Ranger. Based in the US. Content is being downloaded, but when you open the PDFs the images are missing. When the terminal gets to the wiring diagrams part, it shuts down with an error, so the wiring isn't being downloaded at all.

gcacciola commented 4 months ago

@strictlysimple my subscription has already expired, but of course I will subscribe again once we pinpoint the problem. And of course I’ll get @iamtheyammer a coffee, as having these files on hand is priceless.

@strictlysimple if your subscription is still valid, have you tried downloading manuals for another truck? I would try one that has been mentioned in other issues, like an F-150 or something, just to see whether the problem only happens with the 2024 Rangers.

strictlysimple commented 4 months ago

Brother I have 3 trucks in the family, I would literally pay him if given the chance haha. I tried a few other earlier models but it seems to be the same issue. @gcacciola

gcacciola commented 4 months ago

Oh, OK, thanks for trying. So it's not my network latency, not the truck model, and nothing to do with the manuals being in Portuguese. I am running out of guesses. Lol

iamtheyammer commented 4 months ago

Hey all, catching up here. First, I should mention that Playwright (the emulated browser that creates the PDFs) is waiting for all images to load before making the PDF. It is configured to wait for the load event, which is fired when all dependents of the page (including images) have been loaded.
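One caveat worth checking: the browser's load event fires once every dependent request has finished, whether it succeeded or failed, so waiting for it does not by itself prove that the images rendered. A minimal sketch of a check for that follows. It is not the project's code; `ImageLike`, `findBrokenImages`, and the sample URLs are hypothetical names used only for illustration.

```typescript
// Minimal shape of the fields read from an <img> element.
// HTMLImageElement exposes these properties; the interface keeps the sketch DOM-free.
interface ImageLike {
  src: string;
  complete: boolean;
  naturalWidth: number;
}

// Returns the src of every image that did not end up with real pixels.
// A failed request leaves naturalWidth at 0 even after the load event has fired.
function findBrokenImages(images: ImageLike[]): string[] {
  return images
    .filter((img) => !img.complete || img.naturalWidth === 0)
    .map((img) => img.src);
}

// Hypothetical sample data standing in for document.images.
const sampleImages: ImageLike[] = [
  { src: "https://example.com/ok.png", complete: true, naturalWidth: 640 },
  { src: "https://example.com/denied.png", complete: true, naturalWidth: 0 },
];

console.log(findBrokenImages(sampleImages));
```

In a Playwright script, the same predicate could be run inside `page.evaluate()` over `Array.from(document.images)` just before the PDF is generated, which would distinguish "images still loading" from "image requests rejected".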

Here's my guess: in the past, Ford hasn't checked to see if you're logged in before sending images, and this may have changed. Since logging in wasn't required, the Playwright "browser" used to fetch the workshop manuals (not wiring manuals, they're different) isn't logged in (the cookies aren't configured on it), which would make the image requests unauthenticated and therefore fail (and you'd get that little failed image thing).
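If that guess holds, the cookies from the cookie string would need to be set on the Playwright browser context before any workshop page is loaded. The following is a minimal sketch of the conversion step, not the project's actual implementation. `parseCookieString`, `CookieParam`, and the `.example.com` domain are hypothetical; the real cookie domain is not stated in this thread.

```typescript
// Subset of the cookie fields that Playwright's BrowserContext.addCookies accepts.
interface CookieParam {
  name: string;
  value: string;
  domain: string;
  path: string;
}

// Turns a "Cookie:" request header value ("a=1; b=2") into cookie objects.
// Only the FIRST "=" separates name from value, because values may contain "=".
function parseCookieString(cookieString: string, domain: string): CookieParam[] {
  return cookieString
    .replace(/^cookie:\s*/i, "") // tolerate a pasted "Cookie:" prefix
    .split(";")
    .map((pair) => pair.trim())
    .filter((pair) => pair.indexOf("=") > 0) // skip empty or nameless pieces
    .map((pair) => {
      const eq = pair.indexOf("=");
      return {
        name: pair.slice(0, eq),
        value: pair.slice(eq + 1),
        domain,
        path: "/",
      };
    });
}

// Hypothetical input; ".example.com" is a placeholder domain.
const parsedCookies = parseCookieString(
  "SESSION=abc123; TOKEN=permissions=&signature==",
  ".example.com",
);
console.log(parsedCookies);
```

The resulting array has the name/value/domain/path shape that `browserContext.addCookies()` takes, so requests made by pages in that context, including image requests, would carry the cookies.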

So-- in order to verify and fix this I need a set of PTS credentials (username/password). If anyone has a set, please email them to me (email is on my profile). With a set of PTS credentials, I can also fix the issue causing wiring manuals to fail too (PR #17).

strictlysimple commented 4 months ago

Hi @iamtheyammer, my name is Henri. I sent over the credentials via email. I can't tell you how much I appreciate what you're doing for us.

iamtheyammer commented 4 months ago

@strictlysimple

> Hi @iamtheyammer, my name is Henri. I sent over the credentials via email. I can't tell you how much I appreciate what you're doing for us.

Just found your email-- looks like it wound up in spam-- so I'll use those and check this out soon. Thanks!

gcacciola commented 4 months ago

@strictlysimple I thought my credentials had already expired, but I just logged in and noticed they are still valid. I was about to email them to @iamtheyammer, but you were faster than me.

Let me know if you guys need any help.

One quick question: I am almost sure I got my cookie information right; I even tried cookies from another request.

Is it the one that says "Cookie:" and looks like this (I deleted some parts of it so it would not be compromised):

CONTENT_PERMISSIONS=permissions=~WS.|~WT.|~|~WR.|~WC.|~WX.|~URETAILER.|~pilot_slts.|~Diagnostics.|~slts.|~WD|~epdikeys|~WW.&expiration=28135040&signature==; CONTENT_AUTH=permissions=&expiration=2024008&signature==

Or the one that says "Set-Cookie:" and looks like this:

ak_bmsc=~~/++++/+; Domain=.fordservicecontent.com; Path=/; Expires=Thu, 27 Jun 2024 23:52:55 GMT; Max-Age=7200
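For context on the two headers: `Cookie` is the request header the browser sends and is a plain list of `name=value` pairs, while `Set-Cookie` is a response header from the server that carries a single cookie followed by attributes such as `Domain`, `Path`, `Expires`, or `Max-Age`. A rough heuristic for telling a pasted string apart, assuming a hypothetical helper named `classifyCookieHeader`, could look like this. It is a sketch, not part of the tool, and it would misclassify a request cookie that happens to be named after an attribute.

```typescript
type HeaderKind = "cookie" | "set-cookie" | "unknown";

// Attribute names that appear after the first pair in a Set-Cookie header.
const SET_COOKIE_ATTRIBUTES = [
  "domain", "path", "expires", "max-age", "secure", "httponly", "samesite",
];

function classifyCookieHeader(raw: string): HeaderKind {
  const text = raw.trim();
  // An explicit header name settles the question immediately.
  if (/^set-cookie:/i.test(text)) return "set-cookie";
  if (/^cookie:/i.test(text)) return "cookie";

  const parts = text.split(";").map((p) => p.trim()).filter((p) => p.length > 0);
  if (parts.length === 0) return "unknown";

  // Look at everything after the first pair for Set-Cookie attribute names.
  const hasAttribute = parts.slice(1).some((p) => {
    const eq = p.indexOf("=");
    const key = (eq === -1 ? p : p.slice(0, eq)).trim().toLowerCase();
    return SET_COOKIE_ATTRIBUTES.includes(key);
  });
  return hasAttribute ? "set-cookie" : "cookie";
}

// Shortened, made-up strings shaped like the two examples above.
console.log(classifyCookieHeader("CONTENT_PERMISSIONS=permissions=a&signature==; CONTENT_AUTH=permissions="));
console.log(classifyCookieHeader("ak_bmsc=x; Domain=.fordservicecontent.com; Path=/; Max-Age=7200"));
```

A script that replays a logged-in session needs the values the browser sends, which is the `Cookie` form; whether this tool expects exactly that is an assumption here and should be checked against its README.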

iamtheyammer commented 4 months ago

Your cookies are probably right! The script just doesn't use them when fetching workshop manuals.

iamtheyammer commented 4 months ago

Hey all, I think I've got it!

The fix is on the experimental-wiring branch: if you used git clone to download the repository, just git fetch && git switch experimental-wiring. If you just downloaded a zip with the code, you can do so from here.

Once you've got the new code, I'd recommend starting fresh, including installing yarn dependencies and getting information for the params.json and cookieString.txt files, as they've changed slightly. When you're referencing the README, make sure it's the one on the experimental-wiring branch.

Please let me know how it goes! :fire:

strictlysimple commented 4 months ago

It does not seem to be working, at least on my end. I've attached the error log to my email in case it's of any help. @iamtheyammer

iamtheyammer commented 4 months ago

Going to try using your params.json file now

iamtheyammer commented 4 months ago

Looks like your subscription just expired, so I wasn't able to test it. I did, however, add validation for cookie strings and params-- if you update the code (git pull), it will now tell you that your subscription is expired rather than throwing an error.

strictlysimple commented 4 months ago

@iamtheyammer I've refreshed the subscription, this is what I get now:

yarn start -c templates/params.json -s templates/cookieString.txt -o /home/fm/Desktop/Ranger
Processing cookies...
Creating a headless chromium instance...
Attempting to log into PTS...
/home/fm/Downloads/fetch-ford-service-manuals-experimental-wiring/src/index.ts:70
await cookieTestingPage.goto(
^
page.goto: net::ERR_PROXY_CONNECTION_FAILED at https://www.fordtechservice.dealerconnection.com/
Call log:
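The failure in this log is a Chromium network error code embedded in the exception message. A minimal sketch of extracting that code so a script can print a short hint instead of a raw stack trace follows. `extractNetError`, `describeNavigationError`, and the hint table are hypothetical and only illustrate the idea; they are not taken from the project.

```typescript
// Illustrative hint table; only the code seen in the log above is listed.
const NET_ERROR_HINTS: Record<string, string> = {
  ERR_PROXY_CONNECTION_FAILED: "The browser was configured to use a proxy it could not reach.",
};

// Pulls "ERR_..." out of a message such as "page.goto: net::ERR_... at https://...".
function extractNetError(message: string): string | null {
  const match = /net::(ERR_[A-Z0-9_]+)/.exec(message);
  return match?.[1] ?? null;
}

// Builds a one-line description suitable for printing to the terminal.
function describeNavigationError(message: string): string {
  const code = extractNetError(message);
  if (code === null) return "Navigation failed: " + message;
  return code + ": " + (NET_ERROR_HINTS[code] ?? "No hint recorded for this code.");
}

console.log(
  describeNavigationError(
    "page.goto: net::ERR_PROXY_CONNECTION_FAILED at https://www.fordtechservice.dealerconnection.com/",
  ),
);
```

A caller would typically wrap the navigation in try/catch and pass the caught error's message to `describeNavigationError`.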

iamtheyammer commented 4 months ago

Oh sorry-- just fixed it. Try pulling the code and re-running. If you want to work on this in real-time I made a Discord server for this project, too: https://discord.gg/gpfUAsMCGV

jdchaiken commented 3 months ago

> Hey all, catching up here. First, I should mention that Playwright (the emulated browser that creates the PDFs) is waiting for all images to load before making the PDF. It is configured to wait for the load event, which is fired when all dependents of the page (including images) have been loaded.

I just did a fresh fetch from the experimental branch and that code is not in there. I now have the wiring, but there are no images in any of the workshop PDFs (they were missing on the main branch too).

iamtheyammer commented 3 months ago

Merged experimental-wiring into main!