perkeep / gphotos-cdp

This program uses the Chrome DevTools Protocol to drive a Chrome session that downloads your photos stored in Google Photos.
Apache License 2.0

Replaced CaptureScreenshot with `document.activeElement.href` #3

Open daneroo opened 4 years ago

daneroo commented 4 years ago

This replaces the implementation of navToEnd which was using CaptureScreenshot to detect scrolling to the end of the Main/Album page. It is replaced with a query to the DOM.

When in the Main/Album Page, the DOM contains <a href=".." /> elements for all visible images. lastPhotoInDOM() simply returns the last such href in document order.

The DOM actually contains more images than those that are visible, in a kind of virtual scrolling window.

For example, the DOM may contain the following anchors, which are a superset of the photos actually visible on screen:

<a href="./photo/AF1QipAAAAAA" aria-label="Photo - Portrait - Jul 15, 2010, 2:10:48 PM"/>
<a href="./photo/AF1QipBBBBBB" aria-label="Photo - Landscape - Jul 15, 2010, 2:03:10 PM"/>
<a href="./photo/AF1QipCCCCCC" aria-label="Video - Landscape - Jul 30, 2010, 7:20:22 PM"/>
...
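
As a rough illustration of "the last such href in document order", here is a minimal sketch that works on a static copy of the markup above. It is not the PR's code, which queries the live DOM through the DevTools protocol. The helper name `lastPhotoHref` and the regular expression are assumptions made for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// photoHref matches the href of an anchor that points at a photo page.
// The "./photo/" prefix is taken from the sample markup above.
var photoHref = regexp.MustCompile(`<a\s[^>]*href="(\./photo/[^"]+)"`)

// lastPhotoHref returns the href of the last photo anchor in document order,
// or "" when the markup contains none. Hypothetical helper, for illustration.
func lastPhotoHref(markup string) string {
	matches := photoHref.FindAllStringSubmatch(markup, -1)
	if len(matches) == 0 {
		return ""
	}
	return matches[len(matches)-1][1]
}

func main() {
	markup := `<a href="./photo/AF1QipAAAAAA" aria-label="..."/>
<a href="./photo/AF1QipBBBBBB" aria-label="..."/>
<a href="./photo/AF1QipCCCCCC" aria-label="..."/>`
	fmt.Println(lastPhotoHref(markup))
}
```

Because the DOM holds more anchors than are visible, the value returned this way is the last anchor currently rendered into the virtual window, not necessarily the last photo on screen.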

We tried to find the actual oldest photo by using the aria-label attribute, which contains a date for the photo. Unfortunately, that label is localised to each user's language, which makes the date format very hard to parse.
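
To make the localisation problem concrete, here is a small sketch that parses the English label from the sample above and then fails on a label in another language. The French label is an assumed rendering, written only for illustration and not captured from Google Photos. The helper `parseLabelDate` and the layout constant are likewise hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// labelLayout matches the date part of the English labels shown above.
const labelLayout = "Jan 2, 2006, 3:04:05 PM"

// parseLabelDate extracts the timestamp from an aria-label of the form
// "<Kind> - <Orientation> - <Date>". It only understands the English format.
func parseLabelDate(label string) (time.Time, error) {
	parts := strings.SplitN(label, " - ", 3)
	if len(parts) != 3 {
		return time.Time{}, fmt.Errorf("unexpected label format: %q", label)
	}
	return time.Parse(labelLayout, parts[2])
}

func main() {
	labels := []string{
		// Taken from the sample markup above.
		"Photo - Portrait - Jul 15, 2010, 2:10:48 PM",
		// Assumed French rendering, for illustration only.
		"Photo - Portrait - 15 juil. 2010, 14:10:48",
	}
	for _, l := range labels {
		t, err := parseLabelDate(l)
		if err != nil {
			fmt.Println("cannot parse:", err)
			continue
		}
		fmt.Println(t.Format(time.RFC3339))
	}
}
```

Supporting every locale would need one layout (and one set of month names) per language, which is why the label was abandoned as a way to find the oldest photo.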

daneroo commented 4 years ago

I have pushed the two requested changes. (commit: Prefer caller to log errors)

Thanks for your patience, I am just getting used to working remotely, and in smaller increments! I see the value. I also have the -headless PR ready to go. As soon as we merge this, I'll rebase, test, and submit.

Finally: Bonne année!

mpl commented 4 years ago

> I have pushed the two requested changes. (commit: Prefer caller to log errors)
>
> Thanks for your patience, I am just getting used to working remotely, and in smaller increments! I see the value.

Yes, the smaller they are, the greater the chance I find a small slice of time to review one and that we make progress on them. I was busy again this week-end, but I should have more time starting in a couple of days and thereafter.

> I also have the -headless PR ready to go. As soon as we merge this, I'll rebase, test and submit

excellent

> Finally: Bonne année!

Bonne année, bonne santé.

daneroo commented 4 years ago

> Do you know why L309-L310 isn't enough to get the very last element though?

Yes, it is because that CSS selector captures more than the elements that are visible on the screen. There is a complex virtual windowing/scrolling thing going on. It's a fancy DOM version of a buffer overrun!!! Reverse engineering is sooo kludgy.

I revisited all of my assumptions and I implemented a much simpler solution. I think you are gonna like this.

Pushing the update to the PR now.

daneroo commented 4 years ago

Ok, I think this is much better and simpler.

mpl commented 4 years ago

> > Do you know why L309-L310 isn't enough to get the very last element though?
>
> Yes, it is because that CSS selector captures more than the elements that are visible on the screen. There is a complex virtual windowing/scrolling thing going on. It's a fancy DOM version of a buffer overrun!!! Reverse engineering is sooo kludgy.
>
> I revisited all of my assumptions and I implemented a much simpler solution. I think you are gonna like this.
>
>   • Removed the screen capture
>   • Removed the complex CSS selector stuff (yuck, never liked that, too complicated 8-))

I don't remember if I told you, but before I did the screenshots method, one idea I had was to use the presence of their custom scrollbar on the right. As you've probably noticed, as long as you keep scrolling down, it stays visible. And if you stop, it disappears. But, when you reach the bottom of the page, even if you keep on scrolling, it eventually disappears. Unfortunately I did not find out how to "select" it.

>   • Simply advance the selection with repeated kb.RightArrow, kb.End.
>   • Termination criterion is when the active selection stops changing.
>
> Pushing the update to the PR now.
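
The approach described in the bullets above (advance the selection, stop when the active element no longer changes) can be sketched as follows. This is an illustration under stated assumptions, not the code in the PR: the `page` interface, its method names, the `maxSteps` guard, and `fakePage` are all hypothetical, and the browser is replaced by an in-memory fake so the loop can run on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// page abstracts the two browser operations the loop needs. In the real tool
// these would be DevTools calls; the interface and method names are hypothetical.
type page interface {
	// Advance sends the key presses that move the selection forward.
	Advance() error
	// ActiveHref reports the value of document.activeElement.href.
	ActiveHref() (string, error)
}

// navToEnd keeps advancing until the active element stops changing and
// returns the href of the last selected photo. maxSteps bounds the loop.
func navToEnd(p page, maxSteps int) (string, error) {
	prev, err := p.ActiveHref()
	if err != nil {
		return "", err
	}
	for i := 0; i < maxSteps; i++ {
		if err := p.Advance(); err != nil {
			return "", err
		}
		cur, err := p.ActiveHref()
		if err != nil {
			return "", err
		}
		if cur == prev {
			// The selection did not move, so we treat this as the end.
			return cur, nil
		}
		prev = cur
	}
	return "", errors.New("navToEnd: selection still changing after maxSteps")
}

// fakePage simulates an album with a fixed list of photos.
type fakePage struct {
	hrefs []string
	pos   int
}

func (f *fakePage) Advance() error {
	if f.pos < len(f.hrefs)-1 {
		f.pos++
	}
	return nil
}

func (f *fakePage) ActiveHref() (string, error) { return f.hrefs[f.pos], nil }

func main() {
	p := &fakePage{hrefs: []string{"./photo/A", "./photo/B", "./photo/C"}}
	last, err := navToEnd(p, 100)
	fmt.Println(last, err)
}
```

Against a real page, a single unchanged reading might only mean that more photos are still loading, so a real implementation would probably want to wait or retry a few times before concluding it has reached the end.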

daneroo commented 4 years ago

Thanks for your attention to detail.

daneroo commented 4 years ago

I just updated the PR: