jimmejardine / qiqqa-open-source

The open-sourced version of the award-winning Qiqqa research management tool for Windows
GNU General Public License v3.0

upgrade the embedded browser (xulrunner) to the latest version #2

Open jimmejardine opened 5 years ago

GerHobbelt commented 5 years ago

That would mean upgrading both GeckoFX and XulRunner I suppose, as those two are sort of married together if I read the entrails right here, here and here and then more confusing old cruft here.

Extracting my nose out of the divinating bisected chicken, my initial bet would be picking up GeckoFX 60 (or later) and taking it from there, maybe, while praying their blurb about "This repository aims to Firefox 60.0 support" isn't meant too inflexibly this time around...?

GerHobbelt commented 5 years ago

Possible replacement for the whole GeckoFX+XulRunner business: https://github.com/cefsharp/CefSharp/ -- which embeds Chromium

jimmejardine commented 5 years ago

If possible, removing xulrunner would be brilliant. I originally tried using a Chromium wrapper back in 2015, but it was too unstable, so I had to go with xulrunner (blech). By now a WPF binding for Chromium must surely be a beautiful thing!

GerHobbelt commented 4 years ago

Note to self = @GerHobbelt

Decision: must use CefSharp as that is, at the time of writing, the only viable fully open source solution which is actively maintained and tracking a modern browser: Chrome. The choice has become mandatory as there are regrettably few ways to keep Google Scholar BibTeX access.

Direct BibTeX access, the way we've been used to, is only really viable when you are able to 'log in' in the browser, i.e. Google has to know who you are, so that you can adjust your Google Scholar settings and be identified as not-a-robot in a feasible manner (xulrunner reacts very badly to a Google captcha-so-we-can-see-you-are-not-a-bot action from Scholar).

The current work-around used by several people relies on a side effect of an anonymity VPN, namely its frequent switching between many nodes: Google will 'see' short Scholar sessions from many IP addresses, while it's really you doing one long Scholar search session in Qiqqa or otherwise.

However, my expectation is that this workaround will become less useful in the (near?) future, as Google and others are very actively collecting VPN endpoint IP addresses and blocking them from Scholar access or otherwise thwarting their usefulness -- as I have observed in the past months at the time of writing. IMO this is a regrettable side effect of Google fighting anonymity efforts which are counteractive to its data collection and tracking activities, which help Google optimize its marketing strategy and pick 'optimal ads' for each of us. But that's just me guessing why I see immediate captcha or 403 access denied responses coming back, plus some other network response oddities, when using some VPNs in testing.

This may sound hare-brained ATM, but the bottom line is this: no matter why, the observed fact is that Scholar BibTeX access is blocked/rejected/made bloody tough unless you are 'recognized' as using a modern browser and being logged in as a known individual. This wasn't the case in 2018AD or before, but it's reality for me at least at the end of 2019AD. Hence for Qiqqa it is paramount to have a modern browser embedded in such a way that cookies can be transported between instances of that embedded web browser, so that you can go to Google, log in, visit Google Scholar in the sniffer and then continue working there, or in other sniffer sessions in the same Qiqqa instance, without the need to identify yourself again and again.

The CefSharp documentation mentions the possibility of this [cookie sharing]. In initial tests CefSharp has turned out to be sufficiently robust to continue testing it. No other potential open source solution has been found which is current, active and not itself based on CefSharp, so we'll go with CefSharp and that's it.
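
To make the cookie requirement concrete, here is a minimal concept sketch in Python (standard library only). It is not CefSharp code and says nothing about how CefSharp implements this; `SnifferSession`, the `SID` cookie name, the `.google.com` domain and the token value are all hypothetical placeholders. It only illustrates the idea that sessions pointing at one shared cookie store all see a login done in any one of them, while a session with its own private store does not.

```python
from http.cookiejar import Cookie, CookieJar
from urllib.request import Request


def make_cookie(name, value, domain):
    """Build a simple session cookie for `domain` (placeholder values only)."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path="/", path_specified=True,
        secure=True, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


class SnifferSession:
    """Hypothetical stand-in for one embedded browser instance."""

    def __init__(self, cookie_store):
        # Every session handed the same store shares its cookies.
        self.cookie_store = cookie_store

    def log_in(self, domain, token):
        # Pretend the site answered the login with a session cookie.
        self.cookie_store.set_cookie(make_cookie("SID", token, domain))

    def cookie_header_for(self, url):
        # Return the Cookie header this session would send to `url`, or None.
        request = Request(url)
        self.cookie_store.add_cookie_header(request)
        return request.get_header("Cookie")


SCHOLAR_URL = "https://scholar.google.com/scholar?q=test"

shared_store = CookieJar()
first = SnifferSession(shared_store)
second = SnifferSession(shared_store)
isolated = SnifferSession(CookieJar())  # private store, shares nothing

first.log_in(".google.com", "abc123")

print(second.cookie_header_for(SCHOLAR_URL))    # the login from `first` is visible here
print(isolated.cookie_header_for(SCHOLAR_URL))  # no cookie: this session would have to log in again
```

The `isolated` session is the situation to avoid: every sniffer session would have to identify itself to Google again.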

/End of note.

GerHobbelt commented 4 years ago

Addition to note to self above: Microsoft Bing / Academy / what-was-it-called-again?

Microsoft's Scholar alternative offering (forgot the exact name and too lazy to google it now) doesn't even deign to render adequately on current xulrunner. Debugging efforts to discover why have been horribly ineffective: several JavaScript errors and some 'weird shit' (don't ask) have been observed while trying to get a non-all-white page from the Microsoft server in the latest and older Qiqqa releases, while the same site (URL incl. query params, etc.) renders fine in the latest Chrome browser. (Firefox did give some trouble a few months ago while testing this, but I did not pursue that avenue, so I can't tell if that was 'a glitch' or a more permanent render issue of MS Academy in FF.)

Again, the conclusion is the same: we'll have to ride with CefSharp and hope it stays active for a long time and that I can get the proper hooks attached to make the sniffer work again. (There's also the unlisted(?) issue of PDF download behaviour for cached visits, which is completely b0rked in xulrunner: https://github.com/jimmejardine/qiqqa-open-source/issues/54)


Other bits to mind when migrating: