sanskrit-lexicon / csl-websanlexicon

key and xss #27

Open funderburkjim opened 1 year ago

funderburkjim commented 1 year ago

A white-hat bug bounty hunter has been finding xss vulnerabilities in the Cologne web site. He first contacted the webmaster about three months ago. I've been working to deactivate the vulnerabilities.

I thought this issue might be helpful to future maintainers of the site. I'm not providing all details.

funderburkjim commented 1 year ago

key cleansing

The php display programs read user input from request parameters, e.g. program.php?key=deva&dict=mw is processed as $key = $_REQUEST['key']; $dict = $_GET['dict']; etc. In this example, $key would have the value 'deva' and $dict would have the value 'mw'.

One way a cross-site scripting (xss) intrusion could be started is to put, instead of 'deva', a string containing well-chosen Javascript/html code, e.g.

<img src=a onerror=prompt('xss')>

If some other part of program.php then outputs this string unchanged, e.g. with echo($key);, the browser parses it as html and the Javascript in it can be executed. The sample (<img...>) is not dangerous: the image fails to load, so its onerror handler runs and opens a prompt dialog in the user's browser.

Presumably, other more intrusive samples could do real damage, for example by running arbitrary script in a visitor's browser as if it came from the Cologne site.

In the Cologne display php modules, the 'key' variable is output by php (it is the search term). The other variable values, such as the value of 'dict', are never echoed, so they are not vulnerable to this kind of xss.

But for 'key', the code needs to be modified.
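As an illustration of the problem described above, here is a minimal sketch of escaping a value just before it is echoed. The helper name escape_for_html is hypothetical and is not taken from the Cologne code; it only wraps PHP's built-in htmlspecialchars.

```php
<?php
// Hypothetical helper (not from the Cologne code): escape a value
// so the browser shows it as text instead of parsing it as markup.
function escape_for_html(string $value): string
{
    // ENT_QUOTES converts both double and single quotes, so the result
    // can also be placed inside a quoted html attribute.
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Example payload, as it might arrive in $_REQUEST['key'].
$key = "<img src=a onerror=prompt('xss')>";

// Unsafe would be: echo $key;
// Escaped output: the angle brackets become &lt; and &gt;.
echo escape_for_html($key), "\n";
```

With this sketch the browser displays the payload as literal text, and an ordinary key such as 'deva' passes through unchanged.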

funderburkjim commented 1 year ago

cleansing 'key'

The basic strategy that seems to work, at least thus far, is to 'cleanse' the value of 'key'. There seems to be no completely foolproof way to do this. The commit above, when applied to the URL for LRV

https://www.sanskrit-lexicon.uni-koeln.de/scans/LRVScan/2022/web/webtc/getword.php?key=%22%3E%3Cimg%20src%3Da%20onerror%3Dprompt(/xss/)%3E

adds an ugly string to the html output, rather than inserting the 'img' element.
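One possible form of such cleansing is sketched below. This is only an assumption about how it could be done; the commit itself may work differently, and the function name cleanse_key is hypothetical. The sketch removes the characters needed to open a tag, close a quoted attribute, or start an entity.

```php
<?php
// Hypothetical sketch; the actual commit may cleanse 'key' differently.
function cleanse_key(string $raw): string
{
    // Drop characters with special meaning in html: < > " ' & and backtick.
    $clean = str_replace(['<', '>', '"', "'", '&', '`'], '', $raw);
    // Also drop ASCII control characters. Bytes >= 0x80 are untouched,
    // so UTF-8 text such as Devanagari survives.
    $clean = preg_replace('/[\x00-\x1F\x7F]/', '', $clean) ?? '';
    return trim($clean);
}

// The payload from the LRV url above, after url-decoding.
echo cleanse_key('"><img src=a onerror=prompt(/xss/)>'), "\n";
```

With this sketch the payload is reduced to plain text that the browser cannot interpret as an element. Removing characters is stricter than escaping them, and it would need adjusting if any legitimate search term used one of the removed characters.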

activate the new code for all dictionaries

This is applied to all dictionaries via the redo_cologne_all.sh script in csl-websanlexicon/v02/.

funderburkjim commented 1 year ago

thoughts on more robust code changes

Only a few php modules need to be changed. But there is currently some code duplication in the fixes. E.g., the same function-name init_input_keys is used in both indexcaller.php and in parm.php. There may be other modules that also need this revision. It would be better to have the 'cleansing' code in one php module, and to be sure that all other modules get the value for 'key' via this cleansing module.
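A single shared accessor might look like the following sketch. The function name get_clean_param and the idea of passing the request array as an argument are assumptions made for illustration; they are not taken from indexcaller.php or parm.php.

```php
<?php
// Hypothetical shared module that every display module would include.
// Names are illustrative and not taken from the repository.
function get_clean_param(array $request, string $name, string $default = ''): string
{
    $raw = $request[$name] ?? $default;
    // A request like ?key[]=x delivers an array; treat that as missing.
    if (!is_string($raw)) {
        return $default;
    }
    // Any cleansing routine could go here; this sketch escapes for html.
    return htmlspecialchars($raw, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// In a display module this would be: $key = get_clean_param($_REQUEST, 'key');
$key = get_clean_param(['key' => 'deva', 'dict' => 'mw'], 'key');
echo $key, "\n";
```

Taking the request array as a parameter keeps the function easy to test without a web server, and a module that never touches $_REQUEST directly cannot forget the cleansing step.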

It is also likely that this cleansing code could be written in a more robust way.

funderburkjim commented 1 year ago

an old version of AE.

https://sanskrit-lexicon.uni-koeln.de/scans/AEScan/index.php provides a 'scanned edition' of the Apte English-Sanskrit dictionary.

This display is not currently linked on the home page. How did the white-hat find this?

The xss vulnerability was illustrated via:

https://sanskrit-lexicon.uni-koeln.de/scans/AEScan/index.php?sfx=%22%3E%3Cframe+src=%22javascript:alert(%27xss%27)%22/%3E

Or, more easily read, with the url-decoded request parameter 'sfx':

sfx="><frame+src="javascript:alert('xss')"/>

The code was written to use either jpg image files or pdf image files for AE, and the 'sfx' parameter was to allow the user to choose the image file format. When one of the words in the left-hand frame is clicked, the appropriate image is displayed in the right-hand frame.

This is modified in two ways:

Now the above 'xss' vulnerability is gone.
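Because 'sfx' only ever needs to name an image format, one way to close this kind of hole is an allow-list. The sketch below is an assumption for illustration, not the code on the Cologne server; the function name normalize_sfx and the choice of 'jpg' as the fallback are hypothetical.

```php
<?php
// Hypothetical sketch; the live AEScan code is not in a repository.
function normalize_sfx($raw): string
{
    // Only the two image formats mentioned above are accepted.
    $allowed = ['jpg', 'pdf'];
    // Strict comparison: anything that is not exactly an allowed string
    // falls back to the (assumed) default of 'jpg'.
    return in_array($raw, $allowed, true) ? $raw : 'jpg';
}

// In index.php this would be: $sfx = normalize_sfx($_REQUEST['sfx'] ?? null);
echo normalize_sfx('pdf'), "\n";
echo normalize_sfx('"><frame src="javascript:alert(\'xss\')"/>'), "\n";
```

With an allow-list nothing from the request is ever echoed, so no escaping rule has to be right.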

This code is only on the Cologne web site, not in any repository.

gasyoun commented 1 year ago

> ugly string to the html output, rather than inserting the 'img' element.

Ugly indeed

> Now the above 'xss' vulnerability is gone.

Many more left?

funderburkjim commented 1 year ago

correction

The AEScan display IS linked on the Cologne home page (but not on the local xampp home page).

other scanned image displays

All the other scanned displays from the home page have been revised, and should now be safe from the kind of xss intrusion mentioned above.

funderburkjim commented 1 year ago

There are 'semi-digitized' displays for WIL and STC.
These use Perl code. For STC, the Perl code has been converted to PHP, and the display now works (and is safer).

A similar conversion remains to be done for the Wilson display.

There is also a deprecated display, 'Sanskrit and Tamil Dictionaries, 2005', based on Perl code. Its conversion also remains to be done.

funderburkjim commented 1 year ago

Revised https://www.sanskrit-lexicon.uni-koeln.de/tamildictionaries/scanned_edition/index.html (and related php functions) to be more resistant to the xss problem. Note: this code is not currently in a repository.

funderburkjim commented 1 year ago

Revised https://www.sanskrit-lexicon.uni-koeln.de/scans/WILScan/web/index.php. This formerly used Perl code and a mysql database. It is now written completely in php, and uses only text files. Changes were also made so that it is less susceptible to xss mischief.

gasyoun commented 1 year ago

> This formerly used Perl code and a mysql database.

But PERL not fully eliminated on the whole of the server, right?

funderburkjim commented 1 year ago

Soon, no displays on the Cologne server will depend on PERL. I am in the process of converting 'Sanskrit and Tamil Dictionaries, 2005' from Perl to PHP.

funderburkjim commented 1 year ago

attempted xss in csl-corrections

Someone entered "><img src=a onerror=prompt(1)> into all the fields of the 'standard' Correction form. It may have been our white-hat bug bounty hunter. Such a submission causes no damage, but it must be deleted from csl.tsv before processing corrections.

gasyoun commented 1 year ago

> May have been our white-hat bug bounty hunter

I believe so.

funderburkjim commented 1 year ago

accent bug

In the basic, list, etc. displays, the xss safety change was not done properly for the 'show/hide accent' option. This bug is corrected by the above commit.