shadowmoose / pyderman

Install Selenium-compatible Chrome/Firefox/Opera/PhantomJS/Edge webdrivers automatically.
MIT License

url endpoint cache issues with msedgedriver? #32

Open bandophahita opened 1 year ago

bandophahita commented 1 year ago

As I was submitting my PR #31, I noticed that some of the Edge tests would occasionally fail due to a mismatch in the expected version. I saw you added this to the tests:

    def get_latest_os(self, major: str, _os: str) -> set[str]:
        url = f"https://msedgedriver.azureedge.net/LATEST_RELEASE_{major}_{_os.upper()}"
        opts = set()
        for i in range(15):
            # This endpoint occasionally returns an older cached value,
            # so we have to fish for the cache to make testing more robust and cut down on false errors.
            latest_mac = self.fetch(url)
            opts.add(latest_mac)
            time.sleep(0.25)
        return opts

I'm curious whether this affects all of the endpoints on that domain or just the LATEST variations. I ask because that same endpoint is used in edge.get_url(), which means users can sometimes get an error when it tries to download a version that doesn't actually exist.
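One possible mitigation, sketched here as an assumption rather than anything pyderman does today: poll the endpoint a few times, as the test above does, and keep the highest version number seen. `parse_version` and `newest_version` are hypothetical helper names, and the sketch assumes the newest value is the valid one, which this thread does not confirm.

```python
from typing import Iterable, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '113.0.1774.57' into (113, 0, 1774, 57) so versions compare numerically."""
    return tuple(int(part) for part in version.strip().split("."))


def newest_version(candidates: Iterable[str]) -> str:
    """Pick the highest version from the values collected by repeated polling.

    Hypothetical helper: it assumes a stale cache only ever returns an OLDER
    version, so the maximum is the most up-to-date answer.
    """
    cleaned = {c.strip() for c in candidates if c and c.strip()}
    if not cleaned:
        raise ValueError("no version strings to choose from")
    return max(cleaned, key=parse_version)
```

Comparing parsed tuples rather than raw strings matters because a plain string comparison would rank "113.0.1774.9" above "113.0.1774.50".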

Case in point https://github.com/bandophahita/pyderman/actions/runs/5084448185/jobs/9136756725 failed...

AssertionError: 'https://msedgedriver.azureedge.net/113.0.1774.57/edgedriver_mac64_m1.zip' != 'https://msedgedriver.azureedge.net/113.0.1774.50/edgedriver_mac64_m1.zip'

That would suggest 113.0.1774.57 is incorrect, as it does not appear in the XML found at https://msedgedriver.azureedge.net. However, the link works and the version of the driver matches the filename.
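The "does the link work" check can be scripted. This is a rough sketch, assuming the host answers HEAD requests. `build_edge_url` and `url_exists` are hypothetical names, not part of pyderman, and the default filename is simply the one from the failing assertion above.

```python
import urllib.error
import urllib.request


def build_edge_url(version: str, filename: str = "edgedriver_mac64_m1.zip") -> str:
    """Build a download URL in the same shape as the one in the failing assertion."""
    return f"https://msedgedriver.azureedge.net/{version}/{filename}"


def url_exists(url: str, opener=urllib.request.urlopen, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request for `url` gets a 2xx response.

    `opener` is injectable so the check can be exercised without network access.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except urllib.error.URLError:
        # HTTPError (404 and friends) is a subclass of URLError.
        return False
```

Calling `url_exists(build_edge_url("113.0.1774.57"))` would then confirm whether a version reported by the LATEST endpoint is actually downloadable before it is handed back to the user.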

I suspect something more than web host caching is going on...

shadowmoose commented 1 year ago

Good catch, this is certainly possible. Honestly, I haven't looked into it much as I was hoping it was a GitHub-specific CDN thing or a temporary bug with MS. It is possible that the library may need to use alternate methods to find the latest release.

bandophahita commented 1 year ago

I was digging around in https://github.com/SergeyPirogov/webdriver_manager/issues to see if they too were having problems with Edge downloads. I found https://github.com/SergeyPirogov/webdriver_manager/issues/302, which talks about the problem, but it's unclear to me if/how their code change solved it. As far as I can see, they are basically doing the same thing pyderman is, which is using https://msedgedriver.azureedge.net/LATEST_RELEASE_{major}_{OS.upper()}

You could be right about it being a GitHub CDN thing, since I'm not seeing it happen locally. But the fact that I am able to fetch the 'bad' link on my end has me wondering if it is something else... like a bug in how the endpoint works.

bandophahita commented 1 year ago

AH-HA! I found an alternate URL endpoint and wanted to see if I'd get the same answer....

    https://msedgedriver.azureedge.net/LATEST_STABLE --> 113.0.1774.50
    https://msedgedriver.azureedge.net/LATEST_RELEASE_113 --> 113.0.1774.50

    https://msedgewebdriverstorage.blob.core.windows.net/edgewebdriver/LATEST_STABLE --> 113.0.1774.57
    https://msedgewebdriverstorage.blob.core.windows.net/edgewebdriver/LATEST_RELEASE_113 --> 113.0.1774.57
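For anyone who wants to repeat this comparison, here is a small sketch. `compare_hosts` and `hosts_agree` are hypothetical names, and `fetch` is any function that takes a URL and returns the response body as text (downloading and decoding are left to the caller). The stand-in `fetch` in the usage note just replays the values listed above.

```python
from typing import Callable, Dict

# The two hosts compared in this comment.
BASES = (
    "https://msedgedriver.azureedge.net",
    "https://msedgewebdriverstorage.blob.core.windows.net/edgewebdriver",
)


def compare_hosts(name: str, fetch: Callable[[str], str]) -> Dict[str, str]:
    """Ask each host for the same version file (e.g. 'LATEST_STABLE')."""
    return {base: fetch(f"{base}/{name}").strip() for base in BASES}


def hosts_agree(results: Dict[str, str]) -> bool:
    """True when every host reported the same version string."""
    return len(set(results.values())) <= 1
```

Usage with the values observed above:

```python
observed = {
    f"{BASES[0]}/LATEST_STABLE": "113.0.1774.50",
    f"{BASES[1]}/LATEST_STABLE": "113.0.1774.57",
}
results = compare_hosts("LATEST_STABLE", observed.__getitem__)
print(hosts_agree(results))  # False: the two hosts disagree
```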

I don't know how that explains what we see in the tests, since we are specifically pointing at msedgedriver.azureedge.net

edit: gave a quick synopsis of the problem to my 15-year-old. Even she said "wtf?"

shadowmoose commented 1 year ago

Interesting. I wonder which is more accurate. It seems like they've got something rather strange happening with their asset services.