AmpersandTarski / Ampersand

Build database applications faster than anyone else, and keep your data pollution free as a bonus.
http://ampersandtarski.github.io/
GNU General Public License v3.0
40 stars 8 forks source link

The compiler injects data into PHP-code. #1289

Closed stefjoosten closed 2 years ago

stefjoosten commented 2 years ago

What happened

When I ran a prototype, I got this message:

afbeelding

Clearly, this is a mistake.

Version of ampersand that was used

Steps to reproduce

  1. Clone github.com/AmpersandTarski/Nutwente/commit/fa77022f71fd77538eaa7046fe441d302791f4d6.
  2. in the CLI (which is zsh on MacBook M1): docker compose up -d --build
  3. in the browser (which is Firefox vs. 99.0.1): navigate to localhost

Screenshot / Video

Context / Source of ampersand script

stefjoosten commented 2 years ago

Analysis

I inspected the corresponding Ampersand source code, finding this fragment:

   ROLE ExecEngine MAINTAINS MaakPersoonsTaal
   RULE MaakPersoonsTaal: personalia~;I[AanmeldingVrijwilliger];taal |- talen
   VIOLATION ( TXT "{EX} InsPair;talen;Persoon;",SRC I, TXT ";Tekst;", TGT I )

It is obvious (from the context, claim without evidence) that this fragment is in the blame chain. Also, there is no mistake in this fragment. It's correct.

The error message invited me to dig deeper:

'InsPair(talen,Persoon,Vrijwilliger_7dab---,Tekst,NL,EN,Frans (een beetje) en Duits,)'. InsPair() expects 5 arguments, but you have provided 8

I found the text in one of the spreadsheets, in the target of RELATION taal:

afbeelding

Diagnosis

In the file xlsx.hs, I found the following fragment:

afbeelding

It shows that cellToAtomValues returns the string from the cell, i.e. NL,EN,Frans (een beetje) en Duits,, verbatim as XlsxString orig "NL,EN,Frans (een beetje) en Duits,". This looks right, because it is the upstream direction in which we do not want to do any injection prevention. For that purpose, Let us look at the downstream code. The following code prints cell contents inside the violation text:

afbeelding

So, this is where we need to quote the cell contents to prevent code injection:

hanjoosten commented 2 years ago

Unfortunately, the current implementation of the ExecEngine relies on PHP injection. The use of ENFORCE statements will overcome this. However, in the 'old' rules, we use TXT <String> . The contents of such string relies on being transfered to the execengine as is.

hanjoosten commented 2 years ago

I agree that it is at the downstream side where this should be solved. The point from my above comment is, that in the P-structure, both strings are plain Text fields. So at the downstream solution you have to know if the string should be PHP-encoded. I think the function showPhpStr should be modified. It should differentiate to strings that are prefixed with {EX} and strings that are not. Unfortunately, my Haskell environment is currently broken, so I cannot fix it myself before I fixed my environment. @stefjoosten , hope this helps.

stefjoosten commented 2 years ago

Issue #1291 is blocking the further analysis of this issue.

Michiel-s commented 2 years ago

Hi @stefjoosten and @hanjoosten, you don't have to look into the Compiler code in haskell. The issue here is run time data in combination with the way the Exec Engine works.

The exec engine relies on param separators (default ;). At this moment we don't have an alternative yet.

In Exec engine rules, if you can expects user input containing ;, you need to specify a different separator that is way less likely to be used in user input. E.g. the combination of an underscore followed by a semicolon _;.

You can rewrite your rule like this:

ROLE ExecEngine MAINTAINS MaakPersoonsTaal
   RULE MaakPersoonsTaal: personalia~;I[AanmeldingVrijwilliger];taal |- talen
   VIOLATION ( TXT "{EX}_;InsPair_;talen_;Persoon_;",SRC I, TXT "_;Tekst_;", TGT I )
Michiel-s commented 2 years ago

No fix in compiler needed. Real solution is requested in #800