sjehuda commented 1 year ago

I think it would be most appropriate to incorporate the capability of the program Open in Browser into metablocks as follows:

// @filetype text/xml text/html
// @filetype application/xml text/html
// @filetype application/rss+xml text/html
// @filetype application/atom+xml text/html

@filetype: @mimetype would be good too
text/html: text/plain would be good too
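To make the proposal a bit more concrete, here is a rough sketch of how a manager could read such lines from a metadata block. Note that @filetype is only the key proposed in this issue and not an existing FireMonkey key, and that parseFiletypeRules and the example script name are hypothetical.

```javascript
// Hypothetical: @filetype is the key proposed in this issue, not an existing API.
// Parses lines such as "// @filetype application/rss+xml text/html"
// into a Map from the served MIME type to the requested MIME type.
function parseFiletypeRules(metablock) {
  const rules = new Map();
  for (const line of metablock.split('\n')) {
    const match = line.match(/^\s*\/\/\s*@filetype\s+(\S+)\s+(\S+)\s*$/);
    if (match) {
      rules.set(match[1].toLowerCase(), match[2].toLowerCase());
    }
  }
  return rules;
}

const exampleMetablock = [
  '// ==UserScript==',
  '// @name      Feed viewer',
  '// @filetype  application/rss+xml text/html',
  '// @filetype  application/atom+xml text/html',
  '// ==/UserScript==',
].join('\n');

const filetypeRules = parseFiletypeRules(exampleMetablock);
console.log(filetypeRules.get('application/rss+xml'));
```

Lines that do not carry the proposed key are ignored, so the rest of the metadata block can stay as it is.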

About

This is a request to add new API to the core of the Firemonkey extension that can influence a change in document.contentType.

Preface

I have written a userscript that converts structured data files (i.e. XML and JSON) into human-readable HTML files.

Overall, the program works as expected; documents are processed and converted successfully into HTML.

All the extra functionalities work well when document.contentType (read only) does not contain xml.

However, when document.contentType contains xml, any code that uses querySelector fails, because the document is treated as whatever document.contentType was determined to be on the initial request (an XML file) and, apparently, the web browser does not re-check the document once it has been loaded.

An error message from the program suggests that the processed file is XML when in reality it is HTML, which prevents other functions from being executed.

Expected Behavior

Execute actions on the document as HTML.

Actual Behavior

The browser blocks HTML actions (throws errors) because the document is believed to be XML.

Script

https://greasyfork.org/en/scripts/465932-newspaper-native-rss-reader

Test Page

https://lzone.de/liferea/blog/feed.xml application/xml

Note

More information at https://github.com/scriptscat/scriptcat/issues/211

erosman commented 1 year ago

However, when document.contentType has xml, any code that contains querySelector would fail

querySelector works on XML.

Here is a test script:

// ==UserScript==
// @name          New metablock @filetype #568
// @match         https://lzone.de/liferea/blog/feed.xml
// @version       1.0
// ==/UserScript==

console.log(document.querySelector('subtitle'));

Scope of UserScripts

As also mentioned in https://github.com/violentmonkey/violentmonkey/issues/1842#issuecomment-1613105667 , I also believe that the scope of userscripts is, and should be, limited to the webpage, as originally intended.

GM functions were added to ease some of the common difficulties. HTTP headers are in the scope of the browser. TM provides some functionality beyond the scope of the webpage (e.g. GM_webRequest or GM_cookie.list), but that is not supported by other managers.

Allowing userscripts that are meant to be page scripts, access to browser level API can pose a security risk, as well as interfere with functionality of the browser and/or other extensions.

sjehuda commented 1 year ago

document.contentType

querySelector works on XML.

Then something else isn't working the same as it would, if the document.contentType was text/plain.

This is the problem https://github.com/violentmonkey/violentmonkey/issues/1842#issuecomment-1613512833

Even if it were possible to set the content type arbitrarily, rather than being bound to the content types offered by the server, I would still have this problem.

The following code won't solve the issue. https://github.com/Tampermonkey/tampermonkey/issues/1809#issuecomment-1616090500

    GM.xmlHttpRequest({
      method: 'GET',
      url: documentURI,
      headers: {
        "Content-Type": "text/plain",
        "Accept": "text/plain"
      },
      onprogress: function(request) {
        request.responseType = 'text';
      },
      onload: function(request) {
        request.overrideMimeType = 'text/plain';
        if (document.URL.startsWith('file:') ||
            request.status == 200) {
          myResolve(request);
        }
        else {
          myReject("File not Found");
        }
      },
      onerror: function(request) {
        myReject('File not Found')
      }
    })

Vendor bullying

Another reason to add the proposed metablock is to overcome bullying from web browser vendors who decide upon technologies they want to censor.

This userscript was born because of the attempt of the red lizard and globe fox to conceal RSS.

They have done it by:

removing the rss icon to bookmarks (2010)
removing rss completely (2018)
forcing download of, and not allowing viewing of, files of type application/atom+xml and application/rss+xml, unless they are served as text/xml or application/xml (2018)
yet they do display json files in a "professional manner"

It makes no sense. Why did they decide this way?

This is censorship of independent media and people under the guise of UI improvements. This cannot be misconstrued.

JavaScript scope

However, because we can process any file in JavaScript, including documents (ODT and PDF), archives (7Zip and Gzip) and even databases, I suggest adding the @filetype / @mimetype metablocks.

erosman commented 1 year ago

In order to investigate further, please provide a minimal userscript that demonstrates the issue.

sjehuda commented 1 year ago

Because, technically, an example userscript won't be of use, the following example should be enough.

If you insist, I will provide you a minimal script.

Suppose I want to turn data into a table or I want to process a web feed, as I do with my script.

Visit this page with Firefox: https://reclaimthenet.org/feed

You won't be able to process that xml file, because Firefox will force (prompt) you to download that file instead.

sjehuda commented 1 year ago

In order to investigate further, please provide a minimal userscript that demonstrates the issue.

Here is a simple script that doesn't appear to work with xml, because some properties are exclusive to HTML and XML.

Uncaught TypeError: Cannot read property 'clientWidth' of null

Tested with Falkon web browser.

// ==UserScript==
// @name          Falkon Image Fix
// @namespace     i2p.schimon.falkon.image
// @description   Script description
// @include       *
// @version       1.0.0
// ==/UserScript==

if (!document.contentType.startsWith('image/')) { return; };

const
  ele = document.querySelector('img'),
  width = ele.clientWidth,
  height = ele.clientHeight,
  fileType = ['image/avif', 'image/png', 'image/png', 'image/svg+xml', 'image/webp'];

for (let i = 0; i < fileType.length; i++) {
  if (document.contentType.match(fileType[i])) {
    // 808080 525c66 dddddd
    // Source: /questions/35361986/css-gradient-checkerboard-pattern
    document.body.style.backgroundImage = 'linear-gradient(45deg, #a3a3a3 25%, transparent 25%), linear-gradient(-45deg, #a3a3a3 25%, transparent 25%), linear-gradient(45deg, transparent 75%, #a3a3a3 75%), linear-gradient(-45deg, transparent 75%, #a3a3a3 75%)';
    document.body.style.backgroundSize = '20px 20px';
    document.body.style.backgroundPosition = '0 0, 0 10px, 10px -10px, -10px 0px';
  }
}

document.title = `${document.title} ${document.contentType}`

erosman commented 1 year ago

Here is a simple script that doesn't appear to work with xml, because some properties are exclusive to HTML and XML.

Can you give an example site to test?

The script is meant to run only when tab is showing an image. if (!document.contentType.startsWith('image/')) { return; };

sjehuda commented 1 year ago

PNG (HTML) https://thirdeyemedia.wpmudev.host/wp-content/uploads/sites/7/2020/04/logo_sbb1.png

SVG (XML) https://speek.network/static/img/speeklogo.svg

I tested with Falkon browser.

I don't know what the results are with other web browsers.

sjehuda commented 1 year ago

On Wed, 05 Jul 2023 10:30:27 -0700 erosman @.***> wrote:

if (!document.contentType.startsWith('image/')) { return; };

Please comment out or delete that first line.

The code should work with WebKit web browsers using console.

erosman commented 1 year ago

Here is a simple script, tested on Firefox.

// ==UserScript== 
// @name          Falkon Image Fix
// @namespace     i2p.schimon.falkon.image
// @description   Script description
// @match         *://*/*
// @version       1.0.0
// ==/UserScript==

// IIFE (anonymous function wrapper), for error checking & limiting scope
(() => {
  if (!document.contentType.startsWith('image/')) { return; }

  // get the image
  const xml = document.contentType === 'image/svg+xml';
  const img = xml ? document.documentElement : document.querySelector('img');
  if (!img) { return; }

  // limit the max size
  img.style.maxHeight = '100vh';
  img.style.maxWidth = '100vw';

  // make some display changes
  document.title += ' ' + document.contentType;
})();

sjehuda commented 1 year ago

Yes. It works with Falkon too, both html (png etc.) and xml (svg).

I still insist that there are things that are specific to HTML which cannot be done with XML. See also https://github.com/violentmonkey/violentmonkey/issues/1842#issuecomment-1613512833

Yet, in any case, I will address this:

You won't be able to process that xml file, because Firefox will force (prompt) you to download that file instead.

The problem is really the prompt, which prevents opening XML files of Atom, RSS and perhaps other formats like ODT and ODS, all of which can be processed using JavaScript.

So I'm still in favor of metablock @filetype.

erosman commented 1 year ago

The problem is really the prompt, which prevents opening XML files of Atom, RSS and perhaps other formats like ODT and ODS, all of which can be processed using JavaScript.

I would fetch the file and process it.
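A minimal sketch of that suggestion, assuming the feed is same-origin or served with permissive CORS headers. The names parserTypeFor and fetchFeedDocument are made up for illustration. One detail worth noting is that DOMParser only accepts a handful of MIME types, so feed types such as application/rss+xml have to be mapped to a generic XML type before parsing; falling back to text/html for everything else is a choice of this sketch.

```javascript
// MIME types that DOMParser.parseFromString() accepts.
const PARSER_TYPES = new Set([
  'text/html',
  'text/xml',
  'application/xml',
  'application/xhtml+xml',
  'image/svg+xml',
]);

// Map a Content-Type header value to a type DOMParser can handle.
// Feed types such as application/rss+xml fall back to application/xml;
// anything else unknown falls back to text/html (a choice of this sketch).
function parserTypeFor(contentType) {
  const type = (contentType || '').split(';')[0].trim().toLowerCase();
  if (PARSER_TYPES.has(type)) { return type; }
  return type.endsWith('+xml') ? 'application/xml' : 'text/html';
}

// Browser-only: fetch the feed as text and parse it into a Document.
// Assumes the URL is same-origin or served with permissive CORS headers.
async function fetchFeedDocument(url) {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Request failed: ${response.status}`);
  }
  const text = await response.text();
  const type = parserTypeFor(response.headers.get('Content-Type'));
  return new DOMParser().parseFromString(text, type);
}
```

Because the file is fetched as text rather than navigated to, the download prompt should never come into play; the script decides how the content is parsed.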

sjehuda commented 1 year ago

I didn't think of doing so.

I'll look into it.

sjehuda commented 1 year ago

Some feeds are located on different servers so I would likely need to use GM.xmlHttpRequest.
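For the cross-origin case, a sketch of a Promise wrapper could look like the following. It assumes a Greasemonkey-style GM.xmlHttpRequest that takes overrideMimeType as a request detail and reports results through onload and onerror; whether a given manager honours that detail should be checked. The names isCrossOrigin and requestText are hypothetical, and the request function is passed in as a parameter so that the sketch does not depend on one particular manager.

```javascript
// True when feedUrl lives on a different origin than pageUrl,
// in which case a plain fetch() may be blocked by CORS.
function isCrossOrigin(feedUrl, pageUrl) {
  return new URL(feedUrl, pageUrl).origin !== new URL(pageUrl).origin;
}

// Wrap a GM.xmlHttpRequest-style function in a Promise.
// gmRequest is passed in as an assumption of this sketch; in a userscript
// it would be GM.xmlHttpRequest (with the matching @grant).
function requestText(gmRequest, url) {
  return new Promise((resolve, reject) => {
    gmRequest({
      method: 'GET',
      url,
      // Set as a request detail, i.e. before the request is sent.
      overrideMimeType: 'text/plain',
      onload: (response) => {
        if (response.status >= 200 && response.status < 300) {
          resolve(response.responseText);
        } else {
          reject(new Error(`HTTP ${response.status}`));
        }
      },
      onerror: () => reject(new Error('Network error')),
    });
  });
}
```

In a userscript this might be used as `requestText(GM.xmlHttpRequest, feedUrl).then(...)`, falling back to a plain fetch when isCrossOrigin returns false.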

First observation

If script is executed from an HTML page, then I can fetch the feed and replace the current page with a processed feed.

Problem 1

Upon pressing the Back button, the user skips the current page instead of returning to it.

Solution 1

~~Pseudo back button~~ Insert current page to history.
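One way to read "insert current page to history" is with history.pushState, sketched below. The function names are hypothetical, and the historyImpl and win parameters exist only so the logic can be exercised outside a browser. This sketch does nothing about Problem 2.

```javascript
// Record that the page content is about to be replaced by a rendered feed.
// The URL argument is omitted, so the address bar keeps the current URL
// (pushState rejects URLs from another origin).
function pushFeedState(feedUrl, historyImpl = window.history) {
  historyImpl.pushState({ feedUrl }, '');
}

// Tell a history entry created by pushFeedState apart from any other entry.
function isFeedState(state) {
  return Boolean(state && typeof state.feedUrl === 'string');
}

// Browser-only wiring: going Back to a non-feed entry reloads the original page.
function installBackHandler(win = window) {
  win.addEventListener('popstate', (event) => {
    if (!isFeedState(event.state)) {
      win.location.reload();
    }
  });
}
```

With this in place, the feed view gets its own history entry, and pressing Back lands on the entry of the original page, which is then reloaded.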

Problem 2

The current page might have Javascript running, and these running scripts can not be turned off.

Conclusion

Rejected.

Second observation

Use GM_openInTab to open feed in new tab.

Problem

That may be annoying.

Third observation

GM_openInTab might not be needed. See bookmarklets at https://www.squarefree.com/bookmarklets/pagelinks.html

erosman commented 1 year ago

Allowing userscripts that are meant to be page scripts, access to browser level API can pose a security risk, as well as interfere with functionality of the browser and/or other extensions.

In any case, as mentioned earlier, changing headers is beyond the scope of userscripts.

sjehuda commented 7 months ago

I have managed to overcome this setback, namely by passing the HTML element newDocument and processing it in all ways feasible (see preProcess) and then replacing document by newDocument. See v24.04.06 vs. v24.04.08.

There was also another matter which required to create an element and change CSS Stylesheet in order to get the expected behaviour, because JavaScript attribute style does not work with document.contentType that ends with XML. See v24.04.08 vs. v24.04.09.
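A likely reason, stated here as an assumption rather than a diagnosis of the script, is that elements in a plain XML document are generic elements outside the XHTML namespace and therefore have no style property. A stylesheet-based approach could be sketched as follows; toCssRule and addStylesheet are hypothetical names.

```javascript
const XHTML_NS = 'http://www.w3.org/1999/xhtml';

// Turn ('item', { color: 'red' }) into "item { color: red; }".
function toCssRule(selector, declarations) {
  const body = Object.entries(declarations)
    .map(([property, value]) => `${property}: ${value};`)
    .join(' ');
  return `${selector} { ${body} }`;
}

// Browser-only: add a stylesheet instead of setting element.style.
// createElementNS with the XHTML namespace should yield a real <style>
// element even when the document itself was served as XML.
function addStylesheet(doc, cssText) {
  const style = doc.createElementNS(XHTML_NS, 'style');
  style.textContent = cssText;
  doc.documentElement.appendChild(style);
  return style;
}
```

For example, `addStylesheet(document, toCssRule('item', { display: 'block' }))` would style every item element without touching any element's style attribute.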

The only functionality that still does not work on XML documents is the mode switcher (bright and dark), due to the style attribute, yet it might be possible to fix.

erosman / support

[Firemonkey] New metablock @filetype #568
