Open calz1 opened 1 year ago
@calz1 Can you run with node index.js --headless=false? This will open the browser and show what's happening and why the error is occurring.
Good idea (but I guess you wrote it ;) )
When doing that, it starts up, brings up Google Photos, opens a photo I recognize, then looks like this:
I tried View Page Source, but it is grayed out. I tried clicking on the file in the lower left and going to "Show in Folder" to see if there was anything in it, but it says "Removed" underneath.
I think I found something! On a whim, I figured I would check whether there was a really large video or something that it was trying to download and taking a while. I knew what date it was near because of the year/month folders, and I knew what the last successful photo was (a bike trailer) from what was in the folder. I scrolled there and it looks like I have a couple of corrupt photos in my Google Photos timeline. They have a filename but just display that exclamation logo.
I deleted them and now it is proceeding again. I am going to see if I can retrieve them from an old backup. Perhaps there is a way to detect these corrupt ones so it doesn't halt?
I had a couple more of those apparently corrupt photos as indicated by the exclamation mark. I deleted them and it made it several more months, though now I think I encountered a slightly different problem with a corrupt video. Even though it was uploaded over a decade ago, Google thinks it is still processing and won't let me download. I am going to delete it but it would be cool if these were skipped or threw a warning.
It will be tough since I don't have corrupt videos or images in my Google photo. So I will not be able to test thoroughly. However, I will add some logic.
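The skip-or-warn behaviour requested above could be sketched roughly as follows. This is not the project's actual code: withTimeout, downloadAll and downloadOne are hypothetical names, and downloadOne stands in for whatever step downloads a single photo. The idea is only that each item gets a time limit, and a failure or hang logs a warning and moves on instead of halting the run.

```javascript
// Hypothetical helper: reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Timed out after ${ms} ms: ${label}`)),
      ms
    );
  });
  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// `downloadOne(url)` is an assumed stand-in for the per-photo download step.
// Returns the list of URLs that were skipped so they can be reported at the end.
async function downloadAll(urls, downloadOne, ms = 30000) {
  const skipped = [];
  for (const url of urls) {
    try {
      await withTimeout(downloadOne(url), ms, url);
    } catch (err) {
      console.warn(`Skipping ${url}: ${err.message}`);
      skipped.push(url);
    }
  }
  return skipped;
}
```

Collecting the skipped URLs rather than only logging them makes it possible to print a summary of the items that need manual attention.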
Thank you! I've encountered a couple more that caused it to freeze and had to delete them. Some would even display in the GUI but wouldn't download, so I am not sure what is going on. I have been using Google Photos since it was Picasa Web, so I guess there has been opportunity for different upload methods...
I can confirm this glitch. I'm getting it with GIF files that were uploaded way-back through Picasa and other non-Google-Photos tools.
$ node index.js --headless=false
Starting from: https://photos.google.com/archive/photo/AF1QipNnjf_jzMTNLtzg4AQ5zFLRCRJZV9JoOVW6Kndc
Latest Photo: https://photos.google.com/photo/AF1QipNzt30QkiAqEmXhzx9cFnXlKhi7QhdDSw6mcVAw
-------------------------------------
Metadata not found, trying to get date from html
Download Complete: 1916/4/1916-04-03 Attestation Paper of William Earl Motley Back (508195b).gif
node:internal/process/promises:288
triggerUncaughtException(err, true /* fromPromise */);
^
page.waitForURL: Timeout 30000ms exceeded.
=========================== logs ===========================
waiting for navigation until "load"
============================================================
at file:///home/dajhorn/google-photos-backup/index.js:79:16 {
name: 'TimeoutError'
}
Node.js v18.13.0
The chromium instance is crashing and popping the "Restore pages? Chromium didn't shut down correctly" dialog when restarted.
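The log above ends in an uncaught rejection whose name is 'TimeoutError'. A minimal sketch of catching that specific error, assuming the navigation step is wrapped in a function, might look like this. isTimeoutError, tryNavigate and navigate are hypothetical names; only the name: 'TimeoutError' shape is taken from the log.

```javascript
// True for errors shaped like the one in the log above (name: 'TimeoutError').
function isTimeoutError(err) {
  return Boolean(err) && err.name === 'TimeoutError';
}

// `navigate` is an assumed stand-in for the step that awaits page.waitForURL(...).
// Returns true on success, false on a timeout, and rethrows anything else.
async function tryNavigate(navigate) {
  try {
    await navigate();
    return true;
  } catch (err) {
    if (isTimeoutError(err)) {
      console.warn(`Navigation timed out, continuing: ${err.message}`);
      return false;
    }
    throw err; // unexpected errors should still surface
  }
}
```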
@dajhorn @calz1, can you make that image sharable and give me access? So I can implement how to skip them.
@vikas5914 Sure, is the email on your GitHub profile good?
https://photos.app.goo.gl/qeBT5NkNyjicsvhM8 https://photos.app.goo.gl/ArDEBxnKfK4Cb1zv7
BTW, my system is Ubuntu 23.04 Lunar Lobster with the google-photos-backup HEAD installed according to the README.md page.
@calz1, it shows the album is empty.
@dajhorn I will check the .gif issue.
Huh, both ways it is empty? Here's what I see:
One question @calz1 @dajhorn: Do you see the Next/Previous button icons at the GIF or the broken image URL?
@calz1 Yeah, it shows blank.
I see Next/Previous buttons when clicking on the broken image in the main list of photos. They work and advance to the next image (which does work).
Do you see the Next/Previous button icons at the GIF or the broken image URL?
On each page, I always see the Next/Previous button, and I never see the broken image icon.
I'm also seeing this same timeout on an old mp4. It plays fine in Photos, but when the script tries to access it, the browser in headful mode reports:
This video-downloads.googleusercontent.com page can’t be found
No webpage was found for the web address: https://video-downloads.googleusercontent.com/snip?authuser=0 HTTP ERROR 404
I can send the video privately if that would help.
@dakahler, what happens when you try to download it manually?
If I select the video and go to download, it works fine.
Also tried clicking the right arrow from the previous photo and pressing Shift+D since that's closer to what the code does, and still works ok.
Odd that the code clicks "left" to get to the video, but the video is a right click with how I have it sorted (newest to oldest). Maybe that has something to do with it.
Also, when I look at the actual download URL for the video, it's completely different than the one that 404s.
Wild, there's a broken link to... something... that only shows up on the chromium browser launched by the tool. Doesn't show up on regular Chrome, Firefox, or latest Chromium.
@dakahler Got it, I had the same assumption. At this moment, I'm investigating whether we can utilize the installed Chrome instead of "Chromium."
Switching to Firefox does get past this particular issue, though it has some other issues.
@dakahler @dajhorn @calz1 Please check the latest code. It will try to download, and on the error, it will skip that URL. (as long as it has a left arrow).
It will also use the installed Chrome instead of the Chromium browser.
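For readers wondering how a Playwright script can target the installed Chrome, the usual mechanism is the channel launch option. The sketch below is an illustration, not the project's code: buildLaunchOptions is a hypothetical helper, and the default of headless: true when the flag is absent is an assumption.

```javascript
// Turn CLI arguments such as ["--headless=false"] into Playwright launch options.
// Assumption: the run is headless unless --headless=false is passed.
function buildLaunchOptions(argv) {
  const flag = argv.find((arg) => arg.startsWith('--headless='));
  const headless = flag ? flag.split('=')[1] !== 'false' : true;
  return {
    headless,
    channel: 'chrome', // asks Playwright for the installed Google Chrome
  };
}

// Intended use (needs the playwright package, so it is left as a comment):
//   import { chromium } from 'playwright';
//   const browser = await chromium.launch(buildLaunchOptions(process.argv.slice(2)));
```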
The current HEAD gives me a different error:
$ node index.js
Starting from: https://photos.google.com/archive/photo/AF1QipOp0lDRrqBYCEnNtx76gdwvORetDC5NfUu7KwBs
Latest Photo: https://photos.google.com/photo/AF1QipOzVVpRoZ1rs_My0CZe-itlGcRlRE8tuI7CRzM7
-------------------------------------
Metadata not found, trying to get date from html
Download Complete: NaN/NaN/1916-04-03 Attestation Paper of William Earl Motley Front (508195a).gif
node:internal/process/promises:288
triggerUncaughtException(err, true /* fromPromise */);
^
page.waitForURL: Timeout 30000ms exceeded.
=========================== logs ===========================
waiting for navigation until "load"
============================================================
at file:///home/dajhorn/src/google-photos-backup/index.js:80:16 {
name: 'TimeoutError'
}
Node.js v18.13.0
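The NaN/NaN prefix in the log above suggests the year and month folders were built from a date that failed to parse. A defensive sketch, under loud assumptions: folderFor is a hypothetical helper, the unpadded year/month layout is inferred from the earlier 1916/4 path, UTC is chosen arbitrarily, and falling back to a YYYY-MM-DD filename prefix is only a guess that happens to fit this file.

```javascript
// Build a "year/month" folder, avoiding "NaN/NaN" when the date is invalid.
// Assumption: month is not zero-padded, matching the "1916/4" path seen earlier.
function folderFor(date, filename) {
  if (date instanceof Date && !Number.isNaN(date.getTime())) {
    return `${date.getUTCFullYear()}/${date.getUTCMonth() + 1}`;
  }
  // Fallback (assumption): use a leading YYYY-MM-DD in the filename if present.
  const match = /^(\d{4})-(\d{2})-\d{2}/.exec(filename);
  if (match) {
    return `${Number(match[1])}/${Number(match[2])}`;
  }
  return 'unknown'; // last resort: a single catch-all folder
}
```

A catch-all folder keeps undated files together for manual sorting instead of scattering them under a NaN directory.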
@dajhorn, can you run node index.js --headless=false so you can see what's happening when the error occurs? Also, why is there an archive in your URL?
Also, why is there an archive in your URL?
Most of my photographs are archived and available only by clicking the Archive -> Library in the left pane.
Can you run node index.js --headless=false so you can see what's happening when the error occurs?
I'm getting these differences in behavior between invocations:
1. node index.js --headless=false
The first image is successfully downloaded, but the left and right arrows do not appear and the second image is never loaded.
2. Pasting the image URL directly into a new interactive browser instance.
The left and right arrows do not appear for old images, and I cannot change images by using the arrow keys.
3. Opening Google Photos Archive interactively.
My archive is so large that it takes several minutes for the timeline to populate; much longer than most people will wait before assuming an error. The archive library page contains only grey boxes and looks like this while the timeline is loading:
If I wait until the oldest images appear in the archive timeline, and then click an old image, then the left and right buttons appear and the arrow keys work too.
My guess is, therefore, that this failure mode is somehow related to the total number of images in a Google Photos account.
☝️ After downloading my oldest photograph from a cold start, Playwright/Chromium must wait seven minutes for Google Photos to return a link to my second-oldest photograph.
Google Photos returns a link to my third-oldest photograph in less than five seconds and runs much faster thereafter.
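Given that the first navigation can take minutes while later ones take seconds, one option is to retry the slow step with progressively longer timeouts rather than a single fixed 30000 ms limit. The sketch below is illustrative only: retryWithGrowingTimeout and attempt are hypothetical names, and the timeout values in the usage comment are arbitrary examples.

```javascript
// Call `attempt(ms)` with each timeout in turn until one succeeds.
// `attempt` is an assumed stand-in for a step that accepts a timeout in ms.
async function retryWithGrowingTimeout(attempt, timeouts) {
  let lastError;
  for (const ms of timeouts) {
    try {
      return await attempt(ms);
    } catch (err) {
      lastError = err;
      console.warn(`Attempt with ${ms} ms timeout failed: ${err.message}`);
    }
  }
  throw lastError ?? new Error('No timeouts were provided');
}

// Possible use with Playwright (left as a comment because it needs a live page):
//   await retryWithGrowingTimeout(
//     (ms) => page.waitForURL(pattern, { timeout: ms }),
//     [30000, 120000, 600000]
//   );
```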
Interactively, I get proper behavior if I do this:
Press the End key so that the Archive pane scrolls to the last image.
@dajhorn I have not tried with the archive folder. This project was only tested with the direct photo we see when we open Google Photos.
I will check with the archive and see if there are other ways to fix this.
Super cool project! I am looking forward to using this versus processing one of the Google takeouts.
I got about halfway through thousands of photos and ran into an issue. I have read Issue #4.
When I run node setup.js, Google Photos opens up fine. I am logged in and able to browse. I immediately run this and get: