Open KrzysFR opened 1 day ago
I'll answer all these questions in a follow up comment, but I'll start by saying that I'm almost done with a proper manual for FQL which may clarify a lot of these questions.
Why the distinction between int and uint?
It's modeled after Go's implementation of the tuple layer. Thanks for pointing this out. I will need to think about what I want to do with this. Go's tuple layer makes all the integers look like int64 or uint64.
How do you handle NaN or infinities?
I have not added support yet. This will be supported, most likely as tokens like nan, inf, and -inf.
I find it difficult to parse a UUID that is not bounded by either {...} or any other character.
I haven't encountered any problems with parsing UUIDs myself. The context of the FQL query provided me with enough information to parse them without additional bounding characters.
Bytes: when parsing 0x1234, you first see the 0 which could be the start of a number, a uuid, or a byte sequence.
Ah, this gives me a clue as to why you had problems parsing UUIDs. I allow my parser to look ahead, which lets me see that the 0x is followed by hexadecimal digits, and hence I know it's a hex number.
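To illustrate the lookahead idea, here is a minimal sketch that classifies a complete token by matching it against whole-token patterns instead of deciding on the first character. The patterns and the `classify` function are hypothetical and are not taken from FQL's actual parser; the even-digit rule for hex bytes is an assumption.

```python
import re

# Hypothetical token patterns; FQL's real grammar may differ.
# Assumption: a byte string is "0x" followed by an even number of hex digits.
HEX_BYTES = re.compile(r"0x(?:[0-9A-Fa-f]{2})+\Z")
UUID = re.compile(r"[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\Z")
INTEGER = re.compile(r"-?[0-9]+\Z")


def classify(token: str) -> str:
    """Classify a whole token by looking past its first character."""
    if HEX_BYTES.match(token):
        return "bytes"
    if UUID.match(token):
        return "uuid"
    if INTEGER.match(token):
        return "int"
    return "unknown"
```

All three kinds of token can start with 0, so the decision has to wait until enough of the token has been seen. A hand-written parser would do the same thing with a bounded lookahead rather than regular expressions.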
Are hexa digits always upper? lower? either? the syntax definition is a bit ambiguous.
I currently use Go's standard library to parse the hex string, which allows for either. In the syntax definition I only allow for uppercase, but I'm planning on changing this. I prefer lower case myself.
For me, 0xFF and 0xFE are ambiguous, because they are the byte prefixes for the System subspace and Directory Layer, which are a single byte, whereas here I think they would be encoded as 01 FF 00 and 01 FE 00 instead?
Yes, you are correct. (0xff) would be packed as 0x01ff00.
Also, FQL doesn't support reading/writing key-values outside of the directory layer. Therefore, it doesn't support reading/writing to the system subspace. I may change this in the future.
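The packing mentioned above follows the FoundationDB tuple layer's rule for byte strings: a 0x01 type code, the payload with every 0x00 escaped as 0x00 0xFF, then a 0x00 terminator. Unicode strings use the same scheme with type code 0x02. A minimal sketch, with hypothetical helper names and covering only these two element types (this is not FQL's code):

```python
def pack_element(value) -> bytes:
    """Encode one tuple element (bytes or str only) in tuple-layer format."""
    if isinstance(value, bytes):
        code, raw = b"\x01", value
    elif isinstance(value, str):
        code, raw = b"\x02", value.encode("utf-8")
    else:
        raise TypeError("this sketch only handles bytes and str")
    # Escape embedded NUL bytes so the 0x00 terminator stays unambiguous.
    return code + raw.replace(b"\x00", b"\x00\xff") + b"\x00"


def pack(items) -> bytes:
    """Encode a flat tuple by concatenating its encoded elements."""
    return b"".join(pack_element(item) for item in items)


print(pack((b"\xff",)).hex())  # 01ff00
```

The raw single-byte prefixes 0xFF and 0xFE never carry a type code or terminator, which is why they cannot be produced by packing a tuple element.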
How do you handle escaping of unicode characters?
Unicode is not currently supported, only ASCII. I do plan to add unicode support in the future.
Does (...) mean "any tuple, empty or not"?
Yes.
...is ("hello", 123, ..., <int>) supported?
Not currently supported, though I may add support for this in the future.
What would we use here? nil seems weird because it is different (for me) than the concept of "empty".
nil is what I'm using for empty values. This allows the empty value to logically mirror the empty element within a tuple: (nil). I don't plan to change this part, though I am glad you told me your opinion as it helps me see how other people's intuition works.
Version Stamps: maybe add a new stamp or versionstamp or vs type in variables?
Yes, version stamps will be supported sometime in the future. They are not supported yet.
Would it make sense to be able to impose constraints on types? Like a regex on a string, a range on a number, a maximum/minimum/exact size for string/bytes?
Yes, I've considered this. I may add this in the future.
What if I use partitions/sub-partitions? This is a way to "lock" an application into a specific prefix (ie: if "/foo" is a partition with prefix 15 2A (== (42, )), all keys will have this prefix, even sub-directories of this partition).
I have not added support for partitions yet. I still need to look into this. When you set up an FQL instance, you must provide a root directory, which confines all queries to within that directory. I expect partitions to work in a similar way.
Directory names are string, and I also use strings 99%+ of the time, but technically, the names can be any sequence of bytes...
I don't plan to support reading/writing all possible key-values right now. For the near future, I'm focused on supporting the 99% of use cases, which only include key-values encoded using a directory (made of strings) and a tuple. After I have this working, documented, and well tested, then I may implement support for other cases like this one.
This is similar to the question about system keys. Most users don't need to access these, so I will focus on the most common use cases first.
On top of querying, I see this as very useful to encode the "schema" of a layer somehow, so that a UI could automatically decode data into an arbitrary subspace (using the optional Layer Id in directories).
Yes, this is one of the goals of the project.
named variables...
Yes, this is a feature which I will explain in the manual. The manual should be available within the next week or so.
Great idea! I've used tuples for a long time, and always wished there was a standard for representing them and querying them. Hopefully this could be the one :)
I apologize for the flood, they are notes I took when implementing a parser for fql in C#/.NET
I have a few remarks, coming from a long time user of tuples, mostly in the context of writing complex layers in .NET/C#, and not 100% familiar with go conventions.
- Why the distinction between int and uint? Is this a "go" thing, or is there an actual reason?
- […] -123 it is clear that it is not "uint", but what about 123? From my point of view it is both "int" and "uint"
- […] float between 32-bit and 64-bit IEEE numbers?
- How do you handle NaN or infinities?
- I find it difficult to parse a UUID that is not bounded by either {...} or any other character.
- […] {xxxx-xxx...} or "xxxx-xxxx-..." (or maybe it is a Windows thing?)
- […] {...} for uuids would make it non-ambiguous for the parser
- Bytes: when parsing 0x1234, you first see the 0 which could be the start of a number, a uuid, or a byte sequence.
- […] [ 1234 ] or '\x12\x34' (single quote, like it was in the python 2 tuple encoder) for bytes?
- Are hexa digits always upper? lower? either? the syntax definition is a bit ambiguous.
- For me, 0xFF and 0xFE are ambiguous, because they are the byte prefixes for the System subspace and Directory Layer, which are a single byte, whereas here I think they would be encoded as 01 FF 00 and 01 FE 00 instead?
- How do you handle escaping of unicode characters?
- […] "こんにちは世界", high/low surrogates?
- […] \uXXXX to encode any codepoint ?
- Does (...) mean "any tuple, empty or not"? For ex, in (1, <int>, (...), <int>), does the middle part mean "any tuple" ?
- […] is ("hello", 123, ..., <int>) supported? This would help with "variable sized" tuples in some layers, where you still need to parse the last one or two parts of a tuple.
There are a few types that I use frequently in tuples, and that are missing:
- […] '' (two single quotes) to define them. What would we use here? nil seems weird because it is different (for me) than the concept of "empty". Maybe add <empty> or <none> for values?
- Version Stamps: maybe add a new stamp or versionstamp or vs type in variables? ex: (1, <stamp>, ...)
- […] 0x32 and 0x33 in the tuple encoding, followed by 10 or 12 bytes.
- […] Directory (0xFE) and System (0xFF) prefix, which are useful in tuples that have to query the system space, or inside the Directory Layer (each nested partition adds another 0xFE to the bytes).
- […] (0xFE, "hello") which for me reads as "the key 'hello' in the top-level Directory Layer" and encoded as FE 02 h e l l o 00, where I guess here it would be encoded as 01 FE 00 02 h e l l o 00 which is not the same.
- […] \xFF/metadataVersion or other system keys? They usually don't use the tuple encoding for the keys.
- Would it make sense to be able to impose constraints on types? Like a regex on a string, a range on a number, a maximum/minimum/exact size for string/bytes?
- […] uint vs int distinction could be emulated with a "must be positive" or "must be exactly 64-bits" constraint
Regarding directories:
- What if I use partitions/sub-partitions? This is a way to "lock" an application into a specific prefix (ie: if "/foo" is a partition with prefix 15 2A (== (42, )), all keys will have this prefix, even sub-directories of this partition).
- […] /foo/bar, then ALL keys would start with /foo/bar/... so in practice we represent them without the prefix, so something like .../my/dir (similar to a webapp that could be hosted under any path)
- […] ... means "zero or more" in the tuples, maybe use ./my/dir or ~/my/dir to represent "from the root defined for this application" ?
- […] /foo/<string>/bar or /foo/<uuid>/bar ?
On top of querying, I see this as very useful to encode the "schema" of a layer somehow, so that a UI could automatically decode data into an arbitrary subspace (using the optional Layer Id in directories).
For example, I used the following format to define the schema of a custom layer, like a typical "table with index + change log" mini layer:
(..., <metadata_key>) = <metadata_value>
(..., 0, <doc_id>) = <json_bytes>
(..., 1, <index_id>, <value>, <doc_id>) = ''
(..., 2, <index_id>, <value>) = <counter>
(..., 3, <version_stamp>, <doc_id>) = <change_event>
Legend:
- ... means the prefix of the Directory where this layer is stored.
- <name> was a placeholder for a type of data, but the type was not specified
I think this could be adapted to use fql as the syntax, but this would require adding support for named variables:
- <foo:int> or <int:foo> would define foo to be a variable of type int
- <foo:any> / <any:foo> or <foo:> / <:foo>
- […] <uint32> / <uint64> ?
- […] <uint:32> / <uint:64> ?
- […] <uint,32> / <uint,64> ?
The above could become:
~/(<string:metadata_key>) = <any:metadata_value>
~/(0, <uint:doc_id>) = <bytes:json>
~/(1, <uint:index_id>, <int|string|bytes:value>, <uint:doc_id>) = <empty>
~/(2, <uint:index_id>, <int|string|bytes:value>) = <uint64>
~/(3, <stamp:timestamp>, <uint:doc_id>) = <bytes:delta>
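To make the proposal concrete, here is a minimal sketch of how a parser might split a variable written in the type-first form used in the schema above, such as <uint:index_id> or <int|string|bytes:value>, into its allowed types and optional name. The regular expression and the `parse_variable` function are hypothetical; FQL does not currently define this syntax.

```python
import re

# Hypothetical grammar: "<" types [":" name] ">", with types separated by "|".
VARIABLE = re.compile(r"<([a-z0-9|]+)(?::([A-Za-z_][A-Za-z0-9_]*))?>\Z")


def parse_variable(token: str):
    """Return (list_of_type_names, name_or_None) for a variable token."""
    match = VARIABLE.match(token)
    if match is None:
        raise ValueError(f"not a variable: {token!r}")
    types = match.group(1).split("|")
    if any(t == "" for t in types):
        raise ValueError(f"empty type name in: {token!r}")
    return types, match.group(2)


print(parse_variable("<int|string|bytes:value>"))
# (['int', 'string', 'bytes'], 'value')
```

Putting the type first keeps unnamed variables such as <uint64> and <empty> valid without any special case, since the name part is simply optional.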