dockstore / dockstore

An app store for scientific workflows, tools, notebooks, and services
https://dockstore.org/
Apache License 2.0

Syntax highlighter slightly disrespects the Irish #3984

Open aofarrel opened 3 years ago

aofarrel commented 3 years ago

Describe the bug

The CWL syntax highlighter doesn't appear to handle apostrophes in surnames. Since the CWL I see this in is verified, I'm pretty sure the issue is not an actual syntax error but rather a problem with the syntax highlighter. If that assumption is correct, this isn't a huge deal because the workflow can still be submitted, but it does defeat the purpose of the syntax highlighter completely and decreases the readability of the code (and if it were to happen in unverified workflows, users might assume there is a syntax error when there isn't one). If that assumption is incorrect, then some of our verified workflows contain major syntax errors and probably should be unverified.

Example

The CWL version of this workflow has Brian O'Connor's name as the creator. Brian, like myself, has an apostrophe in his surname. This results in the rest of the CWL file being treated as one massive string. Because the workflow is verified, I assume this is a perfectly valid workflow that passed testing, and the issue is just the syntax highlighter. But if it is invalid syntax, it needs to be fixed, as this issue is in several of the verified PCAWG workflows.

Expected behavior

Apostrophes in surnames are not treated as the start of strings when they appear in the foaf:name: field, or they are replaced with U+1F1EE + U+1F1EA. I will also accept an embedded MP3 of Ireland's Call.

Screenshots

clan-o-connor

Additional context

I'm unsure what is valid syntax for these lines, but whatever is valid syntax should be handled by the syntax highlighter. I wonder if two-byte characters are valid?
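For what it's worth, CWL documents are YAML, and in YAML a single quote only opens a quoted scalar when it is the first character of the value; an apostrophe in the middle of an unquoted (plain) scalar is just literal text. The sketch below is not the highlighter's actual code. The function names and the sample value are hypothetical, and it ignores YAML's `''` escape inside quoted scalars. It only contrasts a rule that treats every apostrophe as a string delimiter (which would explain the "rest of the file is one string" symptom) with a rule that follows the YAML behavior described here.

```python
def quoted_spans_naive(value):
    """Hypothetical buggy rule: any apostrophe opens a string."""
    start = value.find("'")
    if start == -1:
        return []
    end = value.find("'", start + 1)
    # With no closing quote, the "string" runs to the end of the input.
    return [(start, len(value) if end == -1 else end + 1)]


def quoted_spans_yaml_aware(value):
    """Hypothetical fixed rule: a quote opens a string only at the start of the value."""
    stripped = value.lstrip()
    if not stripped.startswith("'"):
        # Plain scalar: apostrophes inside it are literal characters.
        return []
    offset = len(value) - len(stripped)
    end = stripped.find("'", 1)
    # Simplification: the '' escape inside single-quoted scalars is not handled.
    return [(offset, len(value) if end == -1 else offset + end + 1)]


if __name__ == "__main__":
    name = "Brian O'Connor"  # sample value for a foaf:name field
    print(quoted_spans_naive(name))       # span swallows everything after the apostrophe
    print(quoted_spans_yaml_aware(name))  # no string span at all
```

Under the naive rule the span starts at the apostrophe and never closes, so everything after it is coloured as a string; under the YAML-aware rule the name is left alone, while a value that really is quoted (for example `'quoted'`) is still highlighted as a string.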

┆Issue is synchronized with this Jira Story ┆fixVersions: Dockstore 2.X ┆friendlyId: DOCK-1677 ┆sprint: Backlog ┆taskType: Story

denis-yuen commented 3 years ago

The issue might be that the dct sections are JSON-LD and are not properly processed.