hed-standard / hed-specification

Specification documents for HED (Hierarchical Event Descriptors)
https://hed-specification.readthedocs.io/en/latest/index.html
Creative Commons Attribution 4.0 International

First pass at rectifying the UTF-8 and value classes #571

Closed: VisLab closed this 8 months ago

VisLab commented 8 months ago

@IanCa @happy5214 @monique2208 The conversion to support UTF-8 is more complex than I had thought and requirements about allowed characters appear throughout the spec. I am sure that I didn't address them all. Can each of you review the changes that I am proposing and let me know of problems? I'd like to get this support released for the language schema, but it is more complicated than I thought. Thx

VisLab commented 8 months ago

Yes... what would you suggest for naming?

On Thu, Mar 21, 2024 at 4:41 PM IanCa @.***> wrote:

@.**** commented on this pull request.

In docs/source/02_Terminology.md https://github.com/hed-standard/hed-specification/pull/571#discussion_r1534717694 :

A contiguous portion of the data recording during which some aspect of the experiment is fixed or noted.
+
+## 2.2 Character sets and restrictions
+
+Starting with HED standard schema versions 8.3.0 and above, HED will allow UTF-8 characters in various settings.
+The types of characters referred to in this specification are:
+
+| Name | Description |
+| ---- | ----------- |
+| ascii | utf-8 codes 0 to 127 (single byte) |
+| non-ascii | utf-8 codes greater than 128 (multi-byte) |

So below 32, the only relevant characters I see are \t, \r, and \n. Do we want to just remove the ascii designation and instead manually specify which characters below 32 are allowed when relevant (as newline already does)?
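To make the distinction under discussion concrete, here is a minimal Python sketch of how the character classes might be checked. The names `ALLOWED_CONTROL`, `classify_char`, and `utf8_length` are hypothetical and not part of any HED tool. The allowed set is an assumption taken only from the three characters named in the comment above. Note that code point 128 is already encoded as two bytes in UTF-8, so the multi-byte range starts at 128 rather than above it.

```python
# Hypothetical helpers for illustration; not part of the HED specification or tools.

# Assumption: the only control characters (code < 32) worth allowing are the
# three named in the review comment: tab, carriage return, and newline.
ALLOWED_CONTROL = {"\t", "\r", "\n"}


def classify_char(ch: str) -> str:
    """Place a single character into one of four illustrative classes."""
    code = ord(ch)
    if code < 32:
        # Control range: allowed only if explicitly listed.
        return "allowed-control" if ch in ALLOWED_CONTROL else "disallowed-control"
    if code < 128:
        # Remaining single-byte range (this sketch does not treat DEL, 127, specially).
        return "ascii"
    # Code points 128 and above need two or more bytes in UTF-8.
    return "non-ascii"


def utf8_length(ch: str) -> int:
    """Return the number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


if __name__ == "__main__":
    for sample in ["A", "\t", "\x00", "\x80", "é"]:
        print(repr(sample), classify_char(sample), utf8_length(sample))
```

Listing the permitted control characters explicitly, as the comment suggests, keeps the rule independent of how the broader "ascii" class ends up being named.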
