Open chrivers opened 8 years ago
Ping @mrfishie @rjwut @IvanSanches @NoseyNick :)
Just chiming in with my 2 cents:
Hey nick, welcome to the new discussion :)
1) ..nope ;-)
The EngineeringConsole object contains exactly 24 objects (3 bytes), but has a bit mask of 4 bytes. There's just no way to know this.
You're right that we can calculate the minimum size, but that's not similar enough here.
Of course, we could add an "unknown" field with an "unknown" type just to force the bitfield to be 4 bytes, but that feels.. dirty :D
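To make the mismatch concrete, here is a minimal sketch of the size calculation. The helper names and the idea of an explicitly declared override are assumptions for illustration only, not something the current syntax defines:

```python
import math
from typing import Optional


def min_mask_bytes(field_count: int) -> int:
    """Smallest number of whole bytes holding one presence bit per field."""
    return math.ceil(field_count / 8)


def mask_bytes(field_count: int, declared: Optional[int] = None) -> int:
    """Prefer an explicitly declared mask size (hypothetical), else the minimum."""
    minimum = min_mask_bytes(field_count)
    if declared is None:
        return minimum
    if declared < minimum:
        raise ValueError(f"{declared} mask bytes cannot cover {field_count} fields")
    return declared


print(min_mask_bytes(24))          # 3: what we could calculate on our own
print(mask_bytes(24, declared=4))  # 4: what the protocol actually sends
```

The calculated minimum only gives a lower bound, which is why the 4-byte mask has to be stated somewhere explicitly.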
2) I agree that forcing people to implement JamCRC to use the docs would be needlessly complicated.
The current style is both an attempt at using the canonical names for something, as well as an attempt at balancing the use of symbolic names and endless "random" (looking) hex digits. I'm open to suggestions, certainly :)
3) :D
Yes! It's driving me up the wall too. The "0 or +1" style is actually used in at least 2 places, and I think it would be much cleaner just to give it a markup. For example, console_type is currently:
console_type: option<enum<u32, ConsoleType>>
if we made this:
console_type: nullable<enum<u32, ConsoleType>>
Then we have clearly marked which fields this goes for, as well as the size.
I agree that it would be { 1. easier 2. more logical 3. less rage-inducing } if we could bind the encoding (u8/u32) to the enum type, but the protocol simply doesn't allow that. Tough noogies.
RE Combining Structs and Objects: Agreed, they are VERY similar, and in my perl I subclass/superclass the two. The big difference being the bitfields for optional later-bits. Would it be possible/practical/silly/good/bad/ugly to have something like:
struct ThingObj
_bitfield : sizedarray<u8, 3>
# the bitfield in this obj is 3 bytes long, for an object with 9-to-12 attributes
foo : optional<0, u32>
# attribute "foo" is an optional u32.
# Its presence/absence is indicated by bit 0 in the _bitfield
bar : optional<1, u8>
baz : optional<2, string>
...
wibble : optional<10, f32>
# attribute "wibble" is an optional f32, see bit 10 in the _bitfield
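For anyone who wants to see how a reader of such a definition might behave, here is a rough sketch. It assumes a little-endian mask (bit 0 is the lowest bit of the first byte), little-endian field encodings, and fields sent in ascending bit order; the `FIELDS` table and `parse_object` are made-up names, and the string field is left out because its encoding is not described here:

```python
import struct

MASK_BYTES = 3  # matches sizedarray<u8, 3> in the example above

# (bit, name, struct format) for each optional field; hypothetical layout
FIELDS = [
    (0, "foo", "<I"),      # optional<0, u32>
    (1, "bar", "<B"),      # optional<1, u8>
    (10, "wibble", "<f"),  # optional<10, f32>
]


def parse_object(data: bytes) -> dict:
    """Read the bitfield, then only the fields whose bit is set."""
    mask = int.from_bytes(data[:MASK_BYTES], "little")
    offset = MASK_BYTES
    result = {}
    for bit, name, fmt in sorted(FIELDS):
        if mask & (1 << bit):
            (value,) = struct.unpack_from(fmt, data, offset)
            offset += struct.calcsize(fmt)
            result[name] = value
    return result


# Bits 0 and 10 set: "foo" and "wibble" are present, "bar" is absent.
mask = (1 << 0) | (1 << 10)
payload = mask.to_bytes(MASK_BYTES, "little") + struct.pack("<I", 7) + struct.pack("<f", 1.5)
print(parse_object(payload))  # {'foo': 7, 'wibble': 1.5}
```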
If this is a bit too artemis-specific, maybe a more generic
optional<condition, type>
where "condition" is a bit more of an expression, maybe "_bitfield&0x01", but maybe expressions are too language-specific :-/ Come to think of it, perhaps "parser" is also a struct with optional bits, except...
struct ServerParser
FrameType: enum<u32, FrameType>
shipSystemSync : optional<FrameType=shipSystemSync, ServerPacket::EngGridUpdate>
clientConsoles : optional<FrameType=clientConsoles, ServerPacket::ConsoleStatus>
...
[edited a few times to fix markdown syntax, and u32 enum, sorry]
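As a rough illustration of the "parser as a struct with optional bits" idea, the dispatch could look like the sketch below. The numeric frame type values, the little-endian u32, and the stub packet parsers are all placeholders invented for the example, not real protocol values:

```python
import struct

# Hypothetical numeric values; the real FrameType values are not given here.
FRAME_TYPES = {0x01: "shipSystemSync", 0x02: "clientConsoles"}


def parse_eng_grid_update(body: bytes) -> dict:
    """Stub standing in for ServerPacket::EngGridUpdate."""
    return {"packet": "EngGridUpdate", "raw": body}


def parse_console_status(body: bytes) -> dict:
    """Stub standing in for ServerPacket::ConsoleStatus."""
    return {"packet": "ConsoleStatus", "raw": body}


PARSERS = {
    "shipSystemSync": parse_eng_grid_update,
    "clientConsoles": parse_console_status,
}


def parse_server_frame(data: bytes) -> dict:
    """Read the u32 FrameType, then hand the rest to the matching parser."""
    (frame_type,) = struct.unpack_from("<I", data, 0)
    name = FRAME_TYPES.get(frame_type)
    if name is None:
        raise ValueError(f"unknown frame type 0x{frame_type:08x}")
    return PARSERS[name](data[4:])


frame = struct.pack("<I", 0x02) + b"\x01"
print(parse_server_frame(frame)["packet"])  # ConsoleStatus
```

Exactly one branch is taken per frame, which is the main difference from an object, where any combination of optional fields may be present.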
The EngineeringConsole object contains exactly 24 objects (3 bytes), but has a bit mask of 4 bytes. There's just no way to know this.
Hmmm, I wonder why MINE has 32 attributes, matching the 4 bytes:
0x03 => ['EngCons',
BeamsHeat=>'f<', TorpsHeat=>'f<', SenseHeat=>'f<', ManuvHeat=>'f<',
ImpulHeat=>'f<', DriveHeat=>'f<', FShldHeat=>'f<', AShldHeat=>'f<',
BeamsEner=>'f<', TorpsEner=>'f<', SenseEner=>'f<', ManuvEner=>'f<',
ImpulEner=>'f<', DriveEner=>'f<', FShldEner=>'f<', AShldEner=>'f<',
BeamsCool=>'C', TorpsCool=>'C', SenseCool=>'C', ManuvCool=>'C',
ImpulCool=>'C', DriveCool=>'C', FShldCool=>'C', AShldCool=>'C',
BeamsUnkn=>'V', TorpsUnkn=>'V', SenseUnkn=>'V', ManuvUnkn=>'V',
ImpulUnkn=>'V', DriveUnkn=>'V', FShldUnkn=>'V', AShldUnkn=>'V',
],
Am I more up-to-date than the docs, or am I behind the docs, or were you referring to some other object, or... something else? [ never mind, I should read my own code before I paste. My last 8 are dummies, perhaps to make it 4 bytes. Checking my entire corpus, I find exactly ZERO instances of those bits being set / fields being sent :-( ]
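For readers who don't speak Perl's pack/unpack: 'f<' is a little-endian single-precision float, 'C' is an unsigned byte, and 'V' is a little-endian unsigned 32-bit integer. A small sketch of the same codes using Python's struct module (the mapping table and `unpack_perl` helper are illustrative names, not part of any existing tool):

```python
import struct

# Perl pack template code -> Python struct format
PERL_TO_STRUCT = {
    "f<": "<f",  # little-endian 32-bit float (the *Heat and *Ener fields)
    "C": "<B",   # unsigned 8-bit integer (the *Cool fields)
    "V": "<I",   # little-endian unsigned 32-bit integer (the *Unkn dummies)
}


def unpack_perl(code: str, data: bytes, offset: int = 0):
    """Decode one value and return it together with the next offset."""
    fmt = PERL_TO_STRUCT[code]
    (value,) = struct.unpack_from(fmt, data, offset)
    return value, offset + struct.calcsize(fmt)


print(unpack_perl("V", bytes([1, 0, 0, 0])))  # (1, 4)
print(unpack_perl("C", b"\xff"))              # (255, 1)
```

With 8 systems and 4 groups of fields, the table above has 32 entries, which is what lines up with the 4-byte mask.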
@NoseyNick I agree, we definitely have to merge the syntax in some way :)
Actually, your example is not bad - and it's entirely valid within the current syntax! I don't personally like the optional<bit, type> syntax, I think it's a bit too verbose (and error prone), but it's entirely valid!
One of the big advantages right now is that it's possible to verify a new layout of bit masks just by shuffling lines around and regenerating the templates. You would lose that advantage if you had to manually update all lines in the object :)
We could definitely express the parser that way! Just remember, it's important we don't call it a "struct", since it would then be nearly impossible to tell that it's not the same "kind of thing". For example, it would then end up in the "structs" table in the documentation. Maybe something other than "parser" if people are not fond of that word, but not something we already use :)
Regarding the 4-byte mask - that was one of the things I fixed in isolinear, that I didn't get around to making an HTML PR for yet. So if you read it in isolinear, it's because I fixed it there :)
This is going a bit off-topic, at least in places.
For questions purely about the Artemis protocol, please open an issue on isolinear:
https://github.com/chrivers/isolinear-chips/issues/new
For everything related to syntax, parsing and using stf, that belongs here no problem :)
btw, I'm currently working on turning the existing index.html into a template, to show how it could be done, and to serve as an example:
https://github.com/chrivers/protocol-docs/tree/transwarp
Not a huge amount of progress yet, but a little getting-started guide, and a few converting items so far :)
We definitely need to merge the syntax, but I'm not really a fan of optional<...> (what do we call these, by the way? functions?) - it will get very messy fast.
For syntax merging, perhaps some kind of 'or' syntax (YACC/BNF style) would be useful? Stealing YACC grammar(ish), here's an example:
ServerParser: FrameType::shipSystemSync(u32) ServerPacket::EngGridUpdate
| FrameType::clientConsoles(u32) ServerPacket::ConsoleStatus
...
Of course we'd need a better way to represent this syntactically (the syntax above isn't compatible with the current syntax and also probably isn't compatible with documentation), but this would definitely increase the flexibility of the language. Unfortunately it would also probably increase the complexity of resulting code (you'd need to effectively create an LR parser).
@mrfishie those are just parameters.
Type parameters have no pre-specified meaning, they simply describe a tree structure. Very much like an alternative syntax for S-expressions, actually.
I'm definitely a fan of BNF, but I think that's a little overkill in terms of complexity. Right now, we have a solution that might have rough edges, but it actually does work :)
Perhaps we are going about this the wrong way. Let's consider what kinds of information we want to store, and what goals we have. Once we agree on these, I think it will be very easy to finalize.
Data:
Goals:
What do you think about this? I feel we can get there with modifications to the current design.
@mrfishie a clarification on types.
It really is just a compact tree structure, and nothing else. There's no special meaning attached to it. Perhaps there should be, so the compiler can prepare the data more for the templates?
For example, someone could write machine<spring, specialnumber<1337>, lever<color<red>>>
.. that's an extreme example, but this could then be used by the templates for whatever purpose.
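To show how little machinery "just a compact tree structure" needs, here is a small sketch of a parser for this kind of type expression. It assumes a name is any run of characters other than `<`, `>`, `,` and whitespace; `parse_type` and `to_sexpr` are names made up for the example and are not the compiler's actual functions:

```python
import re


def parse_type(text: str):
    """Parse 'name<arg, arg<...>>' into a (name, children) tree."""
    tokens = re.findall(r"[<>,]|[^<>,\s]+", text)
    node, pos = _parse(tokens, 0)
    if pos != len(tokens):
        raise ValueError(f"unexpected trailing token {tokens[pos]!r}")
    return node


def _parse(tokens, pos):
    if pos >= len(tokens) or tokens[pos] in {"<", ">", ","}:
        raise ValueError("expected a name")
    name = tokens[pos]
    pos += 1
    children = []
    if pos < len(tokens) and tokens[pos] == "<":
        pos += 1
        while True:
            child, pos = _parse(tokens, pos)
            children.append(child)
            if pos >= len(tokens):
                raise ValueError("missing '>'")
            if tokens[pos] == ",":
                pos += 1
            elif tokens[pos] == ">":
                pos += 1
                break
            else:
                raise ValueError(f"unexpected token {tokens[pos]!r}")
    return (name, children), pos


def to_sexpr(node) -> str:
    """Render the same tree as an S-expression."""
    name, children = node
    if not children:
        return name
    return "(" + " ".join([name] + [to_sexpr(c) for c in children]) + ")"


tree = parse_type("machine<spring, specialnumber<1337>, lever<color<red>>>")
print(to_sexpr(tree))  # (machine spring (specialnumber 1337) (lever (color red)))
```

Nothing in the parser knows what `machine` or `lever` mean; it only builds the tree, and an unbalanced bracket is reported as an error.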
Very much like an alternative syntax for S-expressions, actually.
S-expressions!!
Maybe we should just use those.. I think (magic (machine spring 1337) (lever (color red))) is far more readable than magic<machine<spring, 1337>, lever<color<red>>>, don't you? /s
(Side note: your extreme example has an unmatched right angle bracket)
I'm definitely a fan of BNF, but I think that's a little overkill in terms of complexity. Right now, we have a solution that might have rough edges, but it actually does work :)
Yes, it is definitely overkill. I was just trying to get some discussion going on potential branching methods.
Perhaps there should be, so the compiler can prepare the data more for the templates?
Like some kind of set of inbuilt types that are processed by the compiler?
Very much like an alternative syntax for S-expressions, actually. S-expressions!! Maybe we should just use those.. I think (magic (machine spring 1337) (lever (color red))) is far more readable than magic<machine<spring, 1337>, lever<color<red>>>, don't you? /s
:D
(Side note: your extreme example has an unmatched right angle bracket)
Oops! Thx, I fixed that now :)
I'm definitely a fan of BNF, but I think that's a little overkill in terms of complexity. Right now, we have a solution that might have rough edges, but it actually does work :) Yes, it is definitely overkill. I was just trying to get some discussion going on potential branching methods.
Definitely!
As I see it, we can basically go in 2 directions. Either we go all out, and make a full BNF description, and a real import system, type rules, etc.
Or, we find a more down-to-earth approach, and live with a more soft type system and validator. It will then be more up to the user to employ good practices, to keep the system in check.
Personally, I'm almost always a proponent of the first solution, but I honestly think it might be overkill here.
It is enticing, though. If we could define the meta-structure of object types in the language, and then declare objects of that type, we could enforce order on the code.
For example, and this is purely a thought experiment:
## Declare primitive types, that can be referenced as-is without error.
$primitive i8, i16, i32, i64;
$primitive u8, u16, u32, u64;
# and so on...
## Declare a block type. Here we declare that enum always has a name, and never takes arguments
$declare enum $ident()
## Only one type of content: ident -> int literal mapping
$ident: $int
## Here we declare objects. They have a name, and must take a maskbytes=$int argument
$declare object $ident(maskbytes=$int)
## Objects are ident -> type mappings
$ident: $type
## Parsers take a read=$type argument
$declare parser $ident(read=$type)
## ..either int -> type:
$int: $type
## ..or a referenced constant (like FrameType::foo) -> type:
$const: $type
Now, this is very, very rough of course. But that's the direction we could go in. It would be a very interesting tool, but it's certainly a completely different scope than the original idea :)
Perhaps there should be, so the compiler can prepare the data more for the templates? Like some kind of set of inbuilt types that are processed by the compiler?
I'm not a fan of built-in types, but they could be marked as being recognized, in the source file - like with the primitive declarations above. That way, the compiler could decide at compile-time if a reference is known or not. That would be a nice feature.
Otherwise, every spelling error just becomes a reference to a new "kind" of primitive, which of course doesn't exist (or even worse, does, but by coincidence)
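A sketch of what that compile-time check could look like, assuming the compiler already has the set of declared primitives and the names of all defined sections available. The function name and the inputs are hypothetical:

```python
# Names declared with $primitive in the thought experiment above
PRIMITIVES = {"i8", "i16", "i32", "i64", "u8", "u16", "u32", "u64"}


def unknown_references(type_names, declared_sections):
    """Return referenced names that are neither primitives nor declared sections."""
    known = PRIMITIVES | set(declared_sections)
    return sorted({name for name in type_names if name not in known})


# "u23" is a typo for "u32" and gets reported instead of silently
# becoming a new kind of primitive.
print(unknown_references(["u32", "u23", "ConsoleType"], ["ConsoleType"]))  # ['u23']
```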
(continued from https://github.com/artemis-nerds/protocol-docs/issues/50)
No, rather that I think we will only need 1 section syntax. For example, you can very clearly see the similarities between "struct" and "object". If we make the arguments optional (or find another way to represent the data), there is literally no difference.
Then, the only difference is the name. This is used so the templates can say "give me the enum named foo" or "give me all object definitions". The following would be valid:
(ok, I might be tired, but I think you get the idea).
Since this is only markup, we can decide on a convention we like for the artemis protocol spec. Other projects might decide on other conventions, and so on. It's neither our duty nor place to make any such restrictions.
To clarify: This is NOT how the system works right now, but it's an idea I've been toying with from the beginning. I'm going to try it out.
True - it sticks out a bit. If we had free-form section names, we could name it something more appropriate. Like:
Yes, they're all collected into one big data structure, which is (at the moment) a tree of all defined sections.
Conceptually, this would be:
This extends all the way into the fields, but I think this is enough ascii-art to get the idea across :)
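As a loose illustration of the "tree of all defined sections" idea, a template-facing lookup could be sketched like this. It is not the compiler's actual data structure; the class, its methods, and the sample enum values are invented for the example:

```python
from collections import defaultdict


class SectionTree:
    """Sections grouped first by kind (enum, object, ...), then by name."""

    def __init__(self):
        self._sections = defaultdict(dict)

    def add(self, kind: str, name: str, fields: dict) -> None:
        self._sections[kind][name] = dict(fields)

    def get(self, kind: str, name: str) -> dict:
        """'Give me the enum named foo.'"""
        return self._sections[kind][name]

    def all_of(self, kind: str) -> dict:
        """'Give me all object definitions.'"""
        return dict(self._sections.get(kind, {}))


tree = SectionTree()
# Placeholder members and values, not taken from the real spec
tree.add("enum", "ConsoleType", {"MainScreen": 0, "Helm": 1})
tree.add("object", "EngineeringConsole", {"heat": "f32"})
print(sorted(tree.all_of("object")))  # ['EngineeringConsole']
```

Since the kind is only a key in the tree, free-form section names would not require any change to a structure like this.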
Sorry, "used" was a poor term here. It's where a human would read it, when writing that section ;-)
It's (textually) where it is referenced, so it made maintenance easier. But as I said, unless we plan on giving the files namespaces themselves (perhaps that's not a bad idea!), then it doesn't matter to the output.
I think there's very little (if any at all) to gain from taking Artemis-specific shortcuts. I'm not even sure what it would be?
I agree that if it constrains us from reaching a goal, we could revise the position. But I don't think that will be the case, since it works, today :)
Received!
Yeah, I allowed myself a little nerding out on the naming there, since the target audience is still the Artemis community.
Then again, except as an example, there are no ties from isolinear chips to the compiler, so I don't imagine people who are not looking for the artemis spec will bump into it, in the future.
Well, one simple rule could keep this controlled: All constants must come before all types. That would catch the majority of oops-my-typing kind of errors.
We really need some way to decorate the sections with non-body information, but here's another possibility:
This could work, too. I'm afraid it could get unwieldy though, and we lose the generality of it. It's going to be quite hard to parse this form in more than one line, and it could lead to excruciatingly long lines.
We also can't forego the names (like we do on types), since we want optional values. For example:
This would be a fairly clean way to add versioning information to sections.
But that's the point - right now there are no defined types!
The type parser literally does not care what you write, as long as it is within the syntax. This allows the templates (and thus, the end-user project) to come up with a type description they like.
To clarify - we certainly could add a list of standard type names (u8, u16, u32, f32, string, etc.. ) and then ban those, but it doesn't really solve the problem.
For example, "ConsoleStatus" could refer either to an enum, or to ServerPacket::ConsoleStatus.
We have to find a nice unambiguous way to point to places in the namespace.
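One possible way to make the ambiguity visible, sketched under the assumption that every definition gets a fully qualified name (the `enum::` prefix below is invented for the example; only `ServerPacket::ConsoleStatus` appears in the discussion):

```python
def resolve(name: str, namespace) -> str:
    """Resolve a reference to exactly one fully qualified name, or fail loudly."""
    if "::" in name:
        matches = [q for q in namespace if q == name]
    else:
        matches = [q for q in namespace if q.split("::")[-1] == name]
    if not matches:
        raise KeyError(f"unknown name: {name}")
    if len(matches) > 1:
        raise ValueError(f"ambiguous name {name}: {', '.join(sorted(matches))}")
    return matches[0]


NAMESPACE = ["enum::ConsoleStatus", "ServerPacket::ConsoleStatus", "enum::ConsoleType"]

print(resolve("ConsoleType", NAMESPACE))                  # enum::ConsoleType
print(resolve("ServerPacket::ConsoleStatus", NAMESPACE))  # ServerPacket::ConsoleStatus
# resolve("ConsoleStatus", NAMESPACE) raises ValueError because two definitions match.
```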
I agree that the current solution isn't optimal, but stripping away just "struct" seems odd, and quite arbitrary. It also in a very real way makes the templates more complicated to write. Either that, or we need to have opinions about what constitutes "standard types", but I don't like that.
Agreed! The current solution is not perfect, but it's the least-bothersome one I could find on short notice :)
That's certainly an admirable goal - perhaps we should split the grammar and parsing portions into a separate project once we agree on a version 1.0 syntax.
Regarding the compiler, I can only say I was surprised by how long it took to go from working compiler (which didn't take long at all), to polished ready-to-run tool. I'm thrilled to see where we can take this next, and I hope we can all work to improve the syntax and the tools for everybody :)