Open GreyCat opened 7 years ago
2-space indent
It is understandable, but I prefer tabs. Unfortunately yaml explicitly disallows tabs :(. We also need an .editorconfig complying with the resulting guide in every repo.
Single implicit YAML document per file (i.e. no --- header)
No %YAML x.y version directives, no %TAG directives
I have never used them because I thought of YAML as a "prettier" JSON. (I think it is uglier, in fact, because it is much harder to deal with than JSON. In JSON I can write anything without taking care of spaces, etc., get valid and working JSON, then apply a beautifier to get pretty JSON.) I have read a bit about YAML, and I have a question: why don't we use them, but use our own bicycles like type:? JSON compatibility?
UTF-8 encoding throughout the file
+1 for that
LF (AKA "UNIX") line endings
-1. git deals with it automatically.
trailing newline character in a .ksy file
-1. IMHO it's unneeded and ugly.
Block YAML style MUST be used in most general cases, unless specified otherwise
I don't understand what you mean. Clarification needed.
All identifiers, docstrings, comments and generally all human-readable text SHOULD be kept in English, unless there’s a very good reason not to do so
+1
Use the following order of sections:
I'd like to put doc-ref before ref.
Legal information - license - MUST be a valid SPDX license expression
-1. There can be licenses not in the list. Also some licenses mean nothing when referenced. So, I propose the following. If a license can be referenced and it is in the list, it should be in SPDX format. If it isn't in the list, it can be in any format. If the license cannot be referenced or is a custom one,
license:
  text: >
    license text
should be used.
Lines should be wrapped to be 80 columns long
-1. Editors wrap automatically and it's a pain to deal with manually-wrapped lines.
Another thought: there should be a tool for transforming YAML (even invalid YAML, for example with tabs instead of spaces) to meet the style guide. This bot should process PRs automatically.
It is understandable, but I prefer tabs. Unfortunately yaml explicitly disallows tabs :(.
It's actually not a matter of preference, but embracing what's most popular and most likely to be used by users.
I have read a bit about yaml, and I have a question: why don't we use them, but use own bicycles like type:? JSON compatibility?
Because:
HashMap / dict object.
LF (AKA "UNIX") line endings
-1. git deals with it automatically.
Exactly. git deals with it automatically by converting possible CRLFs in the user's working copy to LFs in the repository. It's up to the user how to check them out and how to work with them, but there should be LFs in the repository.
trailing newline character in a .ksy file
-1. IMHO it's unneeded and ugly.
Again, that's a standard in most languages / editors. It's also the standard in git (git gives a warning if a file has no trailing newline).
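To make the two conventions above concrete (LF line endings, one trailing newline), here is a minimal Python sketch of a normalization step. The function name `normalize_newlines` is made up for illustration; it is not part of any existing tool, and git's own line-ending conversion remains the thing that actually governs what lands in the repository.

```python
def normalize_newlines(text):
    """Convert CRLF / CR to LF and end non-empty text with exactly one LF.

    Hypothetical helper for illustration only; it works on already-decoded
    text and knows nothing about YAML structure.
    """
    # Order matters: handle CRLF first so it does not become two LFs.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    if not text.strip():
        # Empty or whitespace-only input: nothing meaningful to terminate.
        return ""
    return text.rstrip("\n") + "\n"
```

A file that already follows both conventions passes through unchanged, so running this repeatedly is harmless.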
Block YAML style MUST be used in most general cases, unless specified otherwise
I don't understand what you mean. Clarification needed.
YAML specifies two forms of most literals: "block style" and "flow style". Block style is, for example:
- a
- b
- key1: value1
  key2: value2
Flow style is:
[a, b, {key1: value1, key2: value2}]
This clause means that it's preferred to use block style instead of flow style in general (and we'll have special exceptions for contents and some other cases, I guess).
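For anyone who wants to see roughly where flow style shows up in an existing file, here is a rough Python sketch. It is a line-based heuristic, not a YAML parser: the name `flow_style_lines` and the regular expression are my own assumptions, and it will misreport lines inside block scalars that happen to start with a bracket.

```python
import re

# Heuristic: optional "- " sequence markers, then an optional "key: ",
# then an opening "[" or "{". Quoted strings and block scalars are NOT
# understood, so treat the result as a hint rather than a verdict.
FLOW_START = re.compile(r"^\s*(?:-\s+)*(?:[^\s#'\"\[{][^:#]*:\s+)?[\[{]")


def flow_style_lines(text):
    """Return 1-based numbers of lines that appear to open a flow collection."""
    return [
        number
        for number, line in enumerate(text.split("\n"), start=1)
        if FLOW_START.match(line)
    ]
```

Note that it also flags `contents: [0x4d, 0x5a]`, which is exactly the kind of case where an exception is expected, so a real checker would need a list of exempt keys.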
I'd like to put doc-ref before ref.
I presume you mean "before doc"?
In all of the languages where docstrings are generated that I've seen so far, documentation goes first and things like @see, \sa, etc., come at the end of it. Come to think of it, in any article, for example, text goes first and then we have a list of references at the end.
There can be licenses not in the list.
I can't think of any reasons why we should take risks and accept works with licenses not reviewed by SPDX / OSI.
Also some licenses mean nothing when referenced.
Any examples?
Lines should be wrapped to be 80 columns long
-1. Editors wrap automatically and it's a pain to deal with manually-wrapped lines.
The vast majority of language style guides (at least all target languages that we support) impose line length limits → people are mostly familiar with these limits and expect everyone to have them.
In the general case, wrapping lines automatically is not a trivial task. For YAML, for example, I haven't seen any editor yet that does a legible job of wrapping arbitrary YAML lines. For example, wrapping source code is very different from wrapping text (and, AFAIK, no editors exist so far that support parsing of our expression language clauses).
Many other tools (git, GitHub, gists, emails) do not cope well with long lines either. Just take a look:
doc: Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah — is it really comfortable to scroll it all to read this?
Right now we just dump docstring lines as-is, and having them pre-wrapped in ksy helps getting decent docstring output, i.e.:
doc: |
  Some decent wrapped multi-line
  description of a type here.
doc: This string would be way too long to fit into any Java style guides / standards, because it extends past all possible and sane limits. Actually, it will apply to probably any modern language: even Go or Rust have their line limits.
/**
 * Some decent wrapped multi-line
 * description of a type here.
 */
/**
 * This string would be way too long to fit into any Java style guides / standards, because it extends past all possible and sane limits. Actually, it will apply to probably any modern language: even Go or Rust have their line limits.
 */
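For comparison, here is a small Python sketch of what re-wrapping a docstring at generation time could look like, using the standard `textwrap` module. This is not what the compiler does (as said above, docstring lines are currently dumped as-is); the function name `javadoc` is hypothetical, and the sketch deliberately ignores paragraph breaks and any Markdown structure, which is part of why automatic wrapping is harder than it looks.

```python
import textwrap


def javadoc(doc, width=80):
    """Render `doc` as a Javadoc-style comment wrapped to `width` columns.

    Illustration only: all whitespace (including blank lines that would
    separate paragraphs) is collapsed into single spaces before wrapping.
    """
    prefix = " * "
    collapsed = " ".join(doc.split())
    body = textwrap.wrap(collapsed, width=width - len(prefix))
    return "\n".join(["/**"] + [prefix + line for line in body] + [" */"])
```

Calling `javadoc()` on the long `doc:` value above yields a comment whose lines all fit in 80 columns, but a two-paragraph docstring would come out as one merged paragraph.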
Last, but not least, I'd really hate to invent yet another markup language for docstrings. Markdown (CommonMark or whatever) seems to be a more or less accepted standard. In many cases (like C++ Doxygen) it could be just used verbatim, in other cases (like Java HTML-like docs, or C#) it should be easy enough to translate Markdown-like text into HTML subset-like docstrings. And Markdown routinely uses wrapped lines and an empty line for paragraph breaks.
Any examples?
gnu gpl
and give all recipients a copy of this License along with the Program.
This means you must provide the license text. You cannot just say "this program is under GNU GPL".
Come to think of it, in any article, for example, text goes first and then we have a list of references in the end.
It is not a reference, instead it is the original source.
In general case, wrapping lines automatically is not a trivial task. For YAML, for example, I haven't seen any editor yet that does legible job at wrapping arbitrary YAML lines.
What do you mean? If you mean indentation, it is trivial to keep it, for example Notepad++ does this.
And Markdown routinely uses wrapped lines and empty line for paragraph breaks.
No. People do wrapping. Markdown is only a syntax, and this syntax allows wrapping. Though you are free to use non-wrapped lines and rely on editor to render them correctly.
This means you must provide the license text. You cannot just say "this program is under GNU GPL".
You can. Providing license texts is a totally different matter, it doesn't mean that you need to copy-paste the whole GPL text into every file. When you're distributing it, you can just bundle all license texts along, or, for example, in Linux packages, one can require something like common-licenses and provide references to license texts already shipped there. That's actually what this is about, i.e. providing these references.
In general case, wrapping lines automatically is not a trivial task. For YAML, for example, I haven't seen any editor yet that does legible job at wrapping arbitrary YAML lines.
What do you mean? If you mean indentation, it is trivial to keep it, for example Notepad++ does this.
Just tried Notepad++ and I can't say that it really works well with YAML. At least for me, it frequently gets structure wrong, folding doesn't work properly and everything's constantly in red, suggesting an error, while there isn't any.
Here's how auto-wrapping "works" with code in Notepad++:
Good luck figuring out anything there. Docstrings wrapping is also terrible:
There is no way (beside syntax highlighting, which is quirky) to tell if it's legitimate new line or a continuation of previous one.
One idea: it would be good to have two commands: check-style and fix-style. The check-style for checking the .yml for matching the style, and fix-style for automatically fixing the style of the .yml file.
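To give a feel for the scope, here is a minimal Python sketch of the text-level half of such a check-style command. It only covers rules that can be verified without parsing YAML (UTF-8, LF endings, trailing newline, no tabs in indentation, line length); the function name `check_style`, the message texts and the use of line number 0 for whole-file problems are all my own assumptions. Key ordering, block vs flow style and naming rules would need a real YAML parser.

```python
def check_style(data, max_width=80):
    """Return (line_number, message) pairs for text-level style problems.

    `data` is the raw file content as bytes. Line number 0 marks problems
    that concern the file as a whole. Hypothetical sketch, not a real tool.
    """
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as err:
        return [(0, f"not valid UTF-8: {err.reason}")]

    problems = []
    if "\r" in text:
        problems.append((0, "CR or CRLF line endings found"))
    if text and not text.endswith("\n"):
        problems.append((0, "no trailing newline"))

    for number, line in enumerate(text.split("\n"), start=1):
        line = line.rstrip("\r")
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            problems.append((number, "tab in indentation"))
        # len() counts code points, which only approximates display columns.
        if len(line) > max_width:
            problems.append((number, f"line longer than {max_width} columns"))
    return problems
```

An empty result means the file passes these particular checks, nothing more; a fix-style counterpart could reuse the same rules but would have to be much more careful about not changing the meaning of block scalars.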
it would be good to have two commands: check-style and fix-style
Any volunteers to implement it? It's not very trivial, but I guess it's still possible with every YAML parser that exposes YAML lexical stream, i.e. anything libyaml-derived should do (Python, Ruby, SnakeYAML, etc).
Any volunteers to implement it?
It is an interesting task and I would like to see such a feature, but I don't have enough time to implement it for now. I think it can be a low-priority task. This feature is not required in general but would make the process more convenient in some cases.
It's not very trivial, but I guess it's still possible with every YAML parser that exposes YAML lexical stream, i.e. anything libyaml-derived should do (Python, Ruby, SnakeYAML, etc).
Can it be done in Scala? I assume that the commands mentioned above can be sub-commands (or options) of the kaitai-struct-compiler, something like kaitai-struct-compiler check-style foo.ksy.
Can it be done in Scala?
Sure it can, Scala is Turing-complete. However, if I were to develop it, I'd rather choose something else. Embedding it into the compiler would probably be a bad choice, as the compiler does not really deal with YAML parsing: it gets a ready-made object tree and then works off it. There are 2 different implementations of YAML parsers for ksc, and both are kind of "external" (i.e. SnakeYAML is a JVM-only parser written in Java, which heavily relies on Java reflection, and it's already pretty awkward to use in Scala, and our current JS parser doesn't seem to have any concept of interacting with an event stream at all).
Good luck figuring out anything there.
YAML disallows tab indentation. You can enable whitespace indication (View -> Show Symbol -> Show White Space and TAB). I have it enabled constantly.
There is no way (beside syntax highlighting, which is quirky) to tell if it's legitimate new line or a continuation of previous one.
View -> Show Symbol -> Show Wrap Symbol
I've published major update for our style guide draft: see diff or whole guide rendered on doc site.
Please take a look and tell me what you think. There are several pretty bold assertions there (especially about enforcing particular order of keys and attribute naming guides).
Looks good at first sight.
I will try to create a KsyLint component in JS (if it can be done in a few hours) and find out where my ksys violate the rules. I will comment if I find something major where I won't agree with the style guide.
Cool! Actually I've already started something like pretty-printer in Ruby, so just wait a little for me, ok? We'll see if that would work or not %)
Okay. I was thinking about integrating the checker into the WebIDE, so it could flag the problems for you while editing the ksy.
Well, it's not rocket science, we can always rewrite / adjust it later. Probably the most interesting result of this work would be machine-readable schemes of ksy ;)
Ok, here's a proof of concept: ksy_pretty_printer.rb
It obviously doesn't handle a lot (as our style guide does not handle a lot of complex cases, like switching types or fancy enums), but I've managed to test it with:
common/vlq_base128_be.ksy: garbles long value in instance, otherwise no changes
media/id3v1_1.ksy: fixes one style error and removes quotes from contents
I found the KSC-specific structural, naming, and organizational aspects of this style guide incredibly helpful in making my first substantial .ksy file work well, so I hope we can set aside the evidently controversial text-format aspects of it and share it more widely.
Given that we've got quite a lot of .ksy files accumulating in our formats repo, I guess it's a good idea to start shaping some style standard for them, to ensure collaboration and uniformity.
I've started such a document (source), but it still requires a lot of work and thought. I invite everyone to contribute and think of how we should present our formats.
@KOLANICH, what do you think?