SerpicoProject / Serpico

SimplE RePort wrIting and COllaboration tool

Can't upload a .docx template in Greek language #577

Open nikosev opened 5 years ago

nikosev commented 5 years ago

Please fill out the Bug Form or Feature Request Below


Bug

Describe the issue and steps to reproduce

  1. Go to admin/templates/add and try to upload a template (.docx file) written in Greek
  2. The upload fails with the following error message:
    Error with a π character : character without pair between :
    ., To, SQL, SQL, Injections, ). Μέσω αυτής της ευπάθειας είναι δυνατή η απόκτηση πρόσβασης στην βάση δεδομένων της εφαρμογής, η τροποποίηση, η διαγραφή και/ή προσθήκη νέων εγγραφών στην βάση δεδομένων καθώς επίσης και η εγγραφή αρχείων στο σύστημα αρχείων του εξυπηρετητή., Ευπάθειες που Αναγνωρίστηκαν, SQL Injection, Σοβαρότητα κινδύνου:, ΥΨΗΛΗ,
    and
    ΝΑΙ, Περιγραφή, τ, SQL, Injection, SQL, Β, Δ, εδομένων με ανασφαλή τρόπο. Κάποιος επιτιθέμενος μπορεί να εισάγει κατάλληλα δεδομένα εισόδου που τροποποιούν την δομή του ερωτήματος, ., 

Feature Request

Describe the feature and give an example use case:

As I can see from the source code the "metacharacters" are:

metacharacters = document.enum_for(:scan,/Ω|§|¬|π|æ|∞|†|µ|ƒ|÷|å|≠|∆|¥|ツ|⁂|<\/w:tr>/).map { |b| [Regexp.last_match.begin(0),b] }

My document contains the letter "π" in many words, which is why I can't upload the template. Other letters that could be affected are Ω (in the Greek alphabet) and maybe å (if it is used in another alphabet). I would suggest replacing these special characters with tags (each tag's name would describe its functionality), which would also make the code more human-readable.

In other issues, I have noticed that you are not a big fan of this kind of change, because all existing custom templates would break. So, do you have any other workaround to suggest? Is it possible to change the encoding of these letters to bypass the checks?

The last option would be to replace these letters locally to resolve my use case, but we should keep in mind that other users may need, now or in the future, to create a report in Greek.

EDIT: the letter μ is not affected. The µ and ∆ metacharacters in the regex are encoded differently from the Greek letters μ and Δ, so the Greek letters do not trigger the check.
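The behaviour described above can be reproduced with a minimal sketch that applies the same scan idiom to made-up sample strings. Only a subset of the metacharacter list is used, the helper name `find_metacharacters` is hypothetical, and the code points chosen for µ (U+00B5, micro sign) and ∆ (U+2206, increment) are an assumption about how the source file encodes them.

```ruby
# Subset of the metacharacter pattern quoted in the issue.
# ASSUMPTION: µ is U+00B5 (micro sign) and ∆ is U+2206 (increment) in the source.
METACHARACTERS = /\u03C0|\u00B5|\u2206/ # π, µ, ∆

# Hypothetical helper: returns [offset, match] pairs, the same shape as the
# scan expression in the issue.
def find_metacharacters(text)
  text.enum_for(:scan, METACHARACTERS).map { |m| [Regexp.last_match.begin(0), m] }
end

# A Greek word containing π (U+03C0) is flagged as containing a metacharacter.
greek_pi = find_metacharacters("απόκτηση")
# => [[1, "π"]]

# Greek μ (U+03BC) and Δ (U+0394) are different code points, so nothing matches.
greek_mu    = find_metacharacters("\u03BCέσω")
greek_delta = find_metacharacters("\u0394εδομένων")

puts greek_pi.inspect, greek_mu.inspect, greek_delta.inspect
```

If the assumption about the code points holds, this would explain why π (and probably Ω) break Greek templates while μ and Δ do not.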

BuffaloWill commented 4 years ago

@nikosev I appreciate your detailed effort on this one. Thank you for mapping out the problem characters. I knew we would run into issues at some point but wasn't sure which characters would cause us problems.

"So, do you have any other workaround to suggest? Is it possible to change the encoding of these letters to bypass the checks?"

It's ugly but performing character replacements is the best I can think of without programming (if you want to program see below). For example, let's say Greek language users could use ל instead of π in their documents as a metacharacter.

Run a macro when the document template is saved and perform the following actions:

  1. swap instances of π with ץ
  2. swap instances of ל with π

The template could then be uploaded. After the report is generated you would need to do a character replacement of ץ to π. FWIW this is a terrible solution.
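The swap can be sketched on plain strings as follows. This is only an illustration of the replacement order, not a Word macro and not part of Serpico: the constant and method names are hypothetical, and it assumes the document text contains no ץ (U+05E5) of its own and that the text has already been extracted from the .docx file.

```ruby
GREEK_PI    = "\u03C0" # π: a Greek letter and also a Serpico metacharacter
PLACEHOLDER = "\u05E5" # ץ: stands in for a literal π while the template is in Serpico
ALT_META    = "\u05DC" # ל: what Greek-language authors type instead of the π metacharacter

# Before upload: hide every literal π, then turn the author's ל into the real
# π metacharacter. The order matters; reversing it would hide the metacharacters too.
def prepare_for_upload(text)
  text.gsub(GREEK_PI, PLACEHOLDER).gsub(ALT_META, GREEK_PI)
end

# After the report is generated: put the literal π characters back.
def restore_after_generation(text)
  text.gsub(PLACEHOLDER, GREEK_PI)
end

template = "απόκτηση #{ALT_META}X#{ALT_META}"
prepared = prepare_for_upload(template)
restored = restore_after_generation(prepare_for_upload("απόκτηση"))

puts prepared
puts restored
```

After `prepare_for_upload`, the only π characters left in the text are the ones intended as metacharacters, which is what lets the template pass the check.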

My other suggestion would be to code a plugin or fork your own version of the project. If you fork the project, the solution should be quite simple; go to SERPICO_DIR/helpers/xslt_generation.rb and replace any instance of π with a meta-character of your choosing. This library does the handling of meta-characters and iirc you should not need to make any other changes to the project. You can update this issue if you run into other issues.

MaxNad commented 4 years ago

The validation and replacements for the templates are pretty much all done in the same section of the code. Adding a check like if char == π and previous_char != \ would allow the use of an escape character such as \ without being such a significant change.
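One way to sketch that check is with a negative lookbehind instead of a character-by-character comparison. This is a swapped-in technique, not Serpico's code; the names `UNESCAPED_PI`, `unescaped_pi_offsets`, and `unescape_pi` are hypothetical, and the sketch does not handle an escaped backslash in front of π.

```ruby
# Match π only when the previous character is not a backslash.
UNESCAPED_PI = /(?<!\\)\u03C0/

# Offsets of the π characters that should still be treated as metacharacters.
def unescaped_pi_offsets(text)
  text.enum_for(:scan, UNESCAPED_PI).map { Regexp.last_match.begin(0) }
end

# Hypothetical follow-up step: drop the escaping backslashes so that the
# generated report shows a plain π.
def unescape_pi(text)
  text.gsub("\\\u03C0", "\u03C0")
end

sample  = 'a\πb π' # the first π is escaped, the second is a metacharacter
offsets = unescaped_pi_offsets(sample)
# => [5]
cleaned = unescape_pi(sample)
# => "aπb π"

puts offsets.inspect, cleaned
```

Existing templates would keep working under this scheme, since an unescaped π keeps its current meaning; only documents that need a literal π would have to add the backslash.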