VaccineAndDrugEvaluationCentre / rosewood-go

Reference implementation of RoseWood
0 stars 0 forks source link

Error in parsing underscore #21

Closed gurpreetpabla closed 5 years ago

gurpreetpabla commented 5 years ago

The tool is unable to parse PCT_. Please have a look at the test.txt and test.log here.

DrGo commented 5 years ago

what is the correct behaviour? Is this part of the Rosewood standard?

Righolt commented 5 years ago

It seems to be interpreted as an opening of an italic markdown without closing _.

The behaviour is referred to in https://vaccineanddrugevaluationcentre.github.io/info/references/rosewood-reference/rosewood-specs/#the-caption-section as markdown (with a closing) and option for escaping (which works with this). Not defined is what happens with a single _ (or any other markdown element) and whether start/end markdown needs to have a space around them (or can be in the middle of text.

DrGo commented 5 years ago

Thanks Christiaan, for some reason the last line of the log file was cutoff in the github viewer. One possibility is to change the behaviour so that imbalanced markdown tags are rendered as regular text. The trouble is that one imbalance in a long text with multiple markdown tags can then result in unpredictable formatting. I prefer the error message option, perhaps with some improvements. What does everyone think?

Righolt commented 5 years ago

I think most markdown renders just ignore it when there is an imbalance. (Not saying that that is ideal, but it seems that Markdown always renders and just ignores/prints stuff that is not valid syntax.)

DrGo commented 5 years ago

That is the standard browser behaviour of parsing invalid html markup following the robustness principal https://en.wikipedia.org/wiki/Robustness_principle. I personally think it is better to see a specific error (perhaps with a link to a more detailed help screen) than to end up with improperly rendered text (sometimes without even noticing it).

Righolt commented 5 years ago

Although I tend to agree to demand correctness, it is hard to argue to be stricter about Markdown interpretation than most Markdown processors. The real question is who our target audience is here, if we want non-programmers to use it, then error messages are a high barrier to entry (and people may not care so much about exact markup).

@VaccineAndDrugEvaluationCentre/analysts Please share your thoughts. (Don't hold back, because of wat @DrGo and me said before, we need all arguments on the table.)

gurpreetpabla commented 5 years ago

If I am first time user of rosewood, I will start with the easiest step i.e automatic table creation (rosewood_write_dataset feature). Now, the data have text fields with _ in the word like pat_id etc.. One way is to display error and force to change the data which could turn the potential customer off or the other way is to process it with this link, so he be more aware of the elements/features. The other thing is that if data is like V40*,V41*, although V41 will be displayed as italic, I think it should have similar message as above, so the user will know why V41 is italic and how to undo that if he doesn't intent to do that.

gurpreetpabla commented 5 years ago

Just to add my another incident: I have a variable in the descriptive table which has * in one of its value and it is coming from MCHP defined format rhal. I understand this can be escaped using \ and I need to create new format based on rhal. However, how about using \ or something else to define markdown elements like using \**bold**\ to create bold text.

Righolt commented 5 years ago

An alternative could be to have a table wide rule of no markdown.

DrGo commented 5 years ago

fixed.