aces / Loris

LORIS is a web-accessible database solution for longitudinal multi-site studies.
GNU General Public License v3.0

Modernize Javascript bundling in LORIS #9424

Open maximemulder opened 1 month ago

maximemulder commented 1 month ago

This issue presents my opinion on the way we bundle JavaScript files in LORIS, why I think our approach is flawed, and how I think it should be improved. It proposes several changes, which can be discussed and implemented individually.

This is a controversial topic that should be discussed at a LORIS meeting.

Current structure

Currently, in LORIS, there are source Javascript files in the following directories:

When building LORIS with Webpack, the following files are produced (minified and accompanied by a source map)

Javascript files are served to the client using the following pattern:

Current directory structure template (source and compiled):

htdocs/
    js/
        components/
            CommonComponent.js
            CommonComponent.js.map
        script.js
jslib/
    commonLib.ts
jsx/
    CommonComponent.tsx
modules/
    module_a/
        js/
            moduleScript.js
            ModuleComponent.js
            ModuleComponent.js.map
        jsx/
            ModuleComponent.tsx
        .gitignore

Problems of the current structure

There are several problems with the current approach. A few minor problems are:

A more fundamental problem is that our approach to bundling and serving JavaScript files is archaic IMO: going through PHP to serve JavaScript files is not how bundlers are meant to be used nowadays. This results in code and processes that are unnecessarily complex and out of line with the field's best practices, which is notably confusing for newcomers.

Typical modern structure

Modern web projects tend to follow two design principles with regard to JavaScript files and bundling:

This design philosophy has several advantages:

[^1]: Independently of the permissions, this does not change the fact that the front-end code should not show links to pages a user does not have access to. Those pages are theoretically reachable by any user with some reverse engineering, but they would be non-functional, as the back-end would not serve data a user does not have access to.

New structure proposal

I propose several changes to gradually improve the LORIS front-end structure. These changes can be discussed and applied individually:

Ideally, the resulting structure would look like this:

htdocs/
    js/
        dist/
            modules/
                module_a/
                    moduleScript.js
                    moduleScript.js.map
                    components/
                        ModuleComponent.js
                        ModuleComponent.js.map
            components/
                CommonComponent.js
                CommonComponent.js.map
            commonScript.js
            commonScript.js.map
    .gitignore
js/
    components/
        CommonComponent.tsx
    commonScript.ts
modules/
    module_a/
        js/
            components/
                ModuleComponent.tsx
            moduleScript.ts
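
To make the mapping between the source tree and the compiled tree above concrete, here is a minimal sketch in TypeScript. The helper name `distPath` and the constant `DIST_ROOT` are hypothetical and only illustrate the proposed layout. This is not an existing LORIS function or a bundler API.

```typescript
// Hypothetical constant: root of the compiled files in the proposed layout.
const DIST_ROOT = "htdocs/js/dist";

// Hypothetical helper: maps a source file of the proposed layout to the path
// of its compiled counterpart.
function distPath(source: string): string {
  // Replace the TypeScript extension (.ts or .tsx) with .js.
  const compiled = source.replace(/\.tsx?$/, ".js");

  // Module sources: modules/<name>/js/<rest> -> <dist>/modules/<name>/<rest>
  const moduleMatch = compiled.match(/^modules\/([^/]+)\/js\/(.+)$/);
  if (moduleMatch) {
    const [, moduleName, rest] = moduleMatch;
    return `${DIST_ROOT}/modules/${moduleName}/${rest}`;
  }

  // Common sources: js/<rest> -> <dist>/<rest>
  const commonMatch = compiled.match(/^js\/(.+)$/);
  if (commonMatch) {
    return `${DIST_ROOT}/${commonMatch[1]}`;
  }

  throw new Error(`Not a recognised source path: ${source}`);
}
```

For example, `distPath("modules/module_a/js/moduleScript.ts")` yields `htdocs/js/dist/modules/module_a/moduleScript.js`, matching the tree above.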

Notes

What about modularity?

Some may think that the proposed change goes against the modularity of LORIS, notably because serving JavaScript files directly with Apache bypasses the enabled-module check present in PHP. However, I argue the following:

  1. Is there any benefit to this behaviour? The user does not see links to modules they do not have access to anyway, and JavaScript files alone (that is, without back-end access to the disabled modules) cannot do anything by themselves.
  2. This view of the server providing or restricting access to front-end files is outdated IMO. Modern web applications are single-page applications (SPAs), and although LORIS is not an SPA, the decoupling of front-end and back-end is still a modern web principle that LORIS should follow IMO.

IMO, true modularity would be to not even include the files of disabled modules in the bundle. But I do not think that is worth it at this point.
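
For illustration only, excluding disabled modules from the bundle could be sketched as a filter over the bundler's entry map. Everything here is an assumption: the `Entries` type, the `modules/<module>/<script>` naming of entries, and the function name are hypothetical and do not describe the current LORIS build.

```typescript
// Hypothetical entry map: entry name -> source file. Module entries are
// assumed to be named "modules/<module>/<script>".
type Entries = Record<string, string>;

// Keeps common entries and the entries of enabled modules only.
function entriesForEnabledModules(
  entries: Entries,
  enabledModules: ReadonlySet<string>,
): Entries {
  const kept: Entries = {};
  for (const [name, source] of Object.entries(entries)) {
    // Extract the module name, if this entry belongs to a module.
    const moduleName = name.match(/^modules\/([^/]+)\//)?.[1];
    if (moduleName === undefined || enabledModules.has(moduleName)) {
      kept[name] = source;
    }
  }
  return kept;
}
```

The filtered map could then be handed to the bundler as its list of entry points, so that the files of disabled modules are never compiled.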

What about security?

Front-end files such as JavaScript files should not contain any sensitive data or any means to perform an unauthorized operation. By enforcing permissions only for back-end endpoints, we reduce the area we have to cover, which may even increase security. Note that the back-end might still send the results of the permission checks to the front-end so that it knows which interface to show to the user (for example, which modules the user has access to).
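
As a rough sketch of that last point, the front-end could use the permission results only to decide what to display, while the back-end endpoints remain the place where access is actually enforced. The `ModuleAccess` shape and the function name are hypothetical, not an existing LORIS API.

```typescript
// Hypothetical shape of the permission results sent by the back-end.
interface ModuleAccess {
  name: string;
  hasAccess: boolean;
}

// Returns the names of the modules whose links the interface should show.
// This only affects display: the back-end must still check permissions on
// every endpoint.
function visibleModules(access: readonly ModuleAccess[]): string[] {
  return access.filter((module) => module.hasAccess).map((module) => module.name);
}
```

A user who loads a hidden module's script anyway would get an interface without data, since the back-end would refuse the corresponding requests.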

What about project overrides?

Project overrides can either be compiled to /htdocs/js/dist/modules/[file].js (replacing the original compiled file) or /htdocs/js/dist/project/modules/[file].js (using a project directory).
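
The second option (a separate project directory) implies choosing between two candidate files when a script is requested. A minimal sketch of that choice, assuming a hypothetical `resolveModuleScript` helper that is given the set of compiled file paths:

```typescript
// Hypothetical resolver for the "project directory" option: prefer the project
// override when one has been compiled, otherwise fall back to the stock file.
function resolveModuleScript(
  file: string,
  compiledFiles: ReadonlySet<string>,
): string {
  const overridePath = `/htdocs/js/dist/project/modules/${file}.js`;
  const defaultPath = `/htdocs/js/dist/modules/${file}.js`;
  return compiledFiles.has(overridePath) ? overridePath : defaultPath;
}
```

With the first option (replacing the original compiled file), no such lookup is needed, at the cost of no longer being able to tell stock files and overrides apart in the output directory.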

ridz1208 commented 1 month ago

Let me start with this... I'm really not an expert in front end code organization and best practices so I may be way off here.

I see some of the advantages of what you are proposing and I don't have a problem with most of it, but I'm a bit on the fence about the whole "serve all the JS regardless of the modules you have access to" idea. First, because some of our JavaScript still offers insight into how we do things and could maybe be leveraged by hackers if it's just available on the login page, whereas now you would need to be at least logged in and have access to the module to see the JS code itself (think GUID generation and storage algorithms potentially exposing the source and destination of the data). YES, the endpoints have permissions; YES, we can modify the code to only provide the data once the module is loaded and not right off the bat... but still, that's more info out there than is necessary. My 2 cents.

As for the rest, as I understand it the JSX/TSX code remains under the module directory, so I'm okay with that, and I'm okay with centralizing the compiled code... but other than removing the annoyance of adding .gitignore entries, I don't really see the gains that justify the work, unless you tell me that compiling the code will be 10x faster or bring some other great speed improvement.

maximemulder commented 1 month ago

Regarding the most controversial change, "compile everything in one place": it is my "ideal" vision for LORIS, as it would allow LORIS to one day be converted to a single-page application (which I repeat is the standard for modern web projects), but I guess that may be overly ambitious for now. If I am to take things step by step, it is probably better to have local dist folders in each module, and have all the source code live in the module js directory. The gitignore could then just contain modules/*/dist/.

On another note, I recently spent about two hours experimenting with another bundler for LORIS. I managed to build LORIS in 50s~1min with Vite, which is about 1.5x~2x faster than currently but less of an improvement than I had hoped for. I guess Rust-based bundlers (which Vite intends to move to some day) may be able to significantly improve build times, but those are less mature options for now, so I don't think I'll be trying them.