Closed sebastien closed 5 years ago
This is now partially implemented in the master branch. For now, I am still using {"typename": value}, and not the {"type":"typename","value":value} scheme that Sebastien prefers.
The top level objects are:
{"print" : string}
{"warning" : exception}
{"value" : value}
{"shape" : shape}
{"error" : exception}
The json-api output consists of zero or more print or warning objects (representing debug output that is supposed to be printed on the console), followed by a final object which is one of value, shape or error.
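As a sketch, consuming such a stream from JavaScript could look like the following; the message contents here are hypothetical examples, not real curv output:

```javascript
// Hypothetical json-api stream: zero or more print/warning objects,
// then a final value, shape, or error object, newline-separated.
const stream = [
  '{"print": "rendering..."}',
  '{"warning": {"message": "deprecated name"}}',
  '{"value": 42}'
].join("\n");

// One JSON object per line.
const messages = stream.split("\n").map(line => JSON.parse(line));

// Each message's single key is its type: "print", "warning", "value", ...
const types = messages.map(msg => Object.keys(msg)[0]);
```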
An exception is { "message":string, "location":location_array }, where "location" is optional.
A location_array is an array of one or more location.
A location is {"byte_range":[int,int], "filename":string}. The byte_range is a zero-indexed, half-open range of byte indexes into the source file. The filename is optional. TODO: add start and end line and column numbers.
TODO: Sebastien described "context": <an array of the last N lines up to the line that has an error>
which would belong in the location object. The description is ambiguous.
A shape is {"shader":string, "is_2d":bool, "is_3d":bool, "bbox": [[xmin,ymin,zmin],[xmax,ymax,zmax]] }. TODO: Describe picker GUI controls and uniform variables. TODO: Currently, the shader string is a shadertoy-compatible shader, which means some boilerplate needs to be added to the beginning and end to make it valid GLSL. Maybe the full shader should be provided.
I just tried it, and I think you'll need to encode the \n as \\n in the shape.shader field. It's important that the JSON payloads be \n-separated, as that will allow for streaming of messages to the client -- the \n should never be found in the JSON payload itself.
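For what it's worth, any standard JSON encoder already performs this escaping, so serialized messages are safe to newline-delimit. A small JavaScript check:

```javascript
// A shader string containing literal newlines:
const msg = { shape: { shader: "void main() {\n}\n" } };
const line = JSON.stringify(msg);
// The newlines are escaped to the two characters \ and n, so the
// serialized line contains no raw newline and can be framed by one.
const roundTrip = JSON.parse(line);
```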
I think we can drop the error.context attribute; the location is going to be enough to retrieve the context from the source. Also, I think that a character range would actually be better than the byte_range -- I got my wires crossed! If you only have the byte_range it's OK as well, but it will take an extra encoding/decoding step to retrieve the substrings.
It would be nice to have a standalone GLSL shader, but the boilerplate code required to run it is not problematic, so it's good like that for now.
Not sure if this is the right place to discuss it, but I think I'd like to have the parsed AST exported as JSON or S-Expr. It's going to be useful for extracting symbols from libraries and code in the short term, and in the long term I want to edit the AST directly online and stream tree patches to the compiler.
I will fix the newline problem.
By "character range", do you mean "a pair of Unicode code point counts", or do you mean "a pair of <line-number,column> indexes"? I can provide either one, or both. Note that Curv source files are currently restricted to ASCII, so every code point is a byte right now.
There are some big design issues around the AST requirements, which we need to discuss outside the context of JSON-API.
By character range I mean the index of the glyph in a Unicode string, basically so that I can do (in JavaScript) source.substring(range[0], range[1]). But now that you mention it, I think it would be good to have the line number as well:
{
start:{char:<glyph index>, line:<glyph line number>, column:<glyph column number>},
end:{...}
}
This should provide all the info to highlight the error in the UI and would save implementing a code source search for the fragment.
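A sketch of how the proposed range would be consumed in the UI; the location values here are hypothetical, shown for the source "2+true":

```javascript
const source = "2+true";
// Hypothetical location using the proposed glyph-index ranges:
const loc = {
  start: { char: 0, line: 0, column: 0 },
  end:   { char: 6, line: 0, column: 6 }
};
// Half-open range: end.char is the first index past the span,
// so substring() can be used directly.
const fragment = source.substring(loc.start.char, loc.end.char);
```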
I'll open a ticket for the AST, or do you prefer to discuss it by email or on the forum?
In Python, a character index is a Unicode Code Point index. In Javascript, a character index is a UTF-16 Code Unit index. The index values are different if the string contains emojis. Swift uses yet another definition, Extended Grapheme Cluster index, where the index values depend on tables which change with each new Unicode release, so client and server have to agree on a Unicode version. This is just one of about 1000 reasons why I don't support Unicode right now. But yes, I will use your design, except I will use 'byte' instead of 'char', since character indexes are different in every programming language. And byte indexes will work as character indexes since Curv strings are restricted to ASCII.
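The JavaScript side of this can be demonstrated directly; an emoji makes the UTF-16 code unit count diverge from the code point count:

```javascript
// "a😀b": one emoji (U+1F600) between two ASCII letters.
const s = "a\u{1F600}b";
// .length counts UTF-16 code units; the emoji is a surrogate pair:
const codeUnits = s.length;
// Spreading a string iterates code points instead:
const codePoints = [...s].length;
```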
I don't want to spam the forum too much with this discussion, so let's use a github issue.
I redesigned the location object. Here is the output of curv -ojson-api -x '2+true', after being piped through jq to format it:
{
"error": {
"message": "2 + true: domain error",
"location": [
{
"start": {
"char": 0,
"line_begin": 0,
"line": 0,
"column": 0
},
"end": {
"char": 6,
"line": 0,
"column": 6
}
}
]
}
}
start.line_begin is the byte/character offset of the first character of the first line. start.char and end.char are a half-open interval: end.char is the index of the first character after the text span that we are indexing. Therefore, if the text span has zero length, then start.char == end.char. end.column is the index of the first character past the end of the text span.

No problem for ASCII/byte only, but then I think we should change "char" to "byte" to avoid confusion (I had a similar issue when my parser was giving byte offsets into a Unicode string while the JS front-end was interpreting them as glyph/char offsets).
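Under this scheme, column should presumably equal char - line_begin, since all three fields are character counts measured from the same origin; a sketch with hypothetical values consistent with the '2+true' output above:

```javascript
// Hypothetical start object, as in the error output shown above:
const start = { char: 0, line_begin: 0, line: 0, column: 0 };
// Assumption: column is derivable as char - line_begin.
const derivedColumn = start.char - start.line_begin;

// A zero-length text span under the half-open scheme has
// start.char === end.char:
const emptySpan = { start: { char: 3 }, end: { char: 3 } };
const spanLength = emptySpan.end.char - emptySpan.start.char;
```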
Also, do you plan to use {type,value} or {<type>:value} in the long term? I'm still working on the GUI infrastructure, but I'm going to start wiring the JSON API interop fairly soon and I'd like to avoid re-doing some code. I think the first format is more resilient, but I don't mind the other either (I might just say "I told you so" later on ;).
I'd like to continue using {"type":value}. Here's my reasoning.
In the future, we'll need a bidirectional communication protocol for sending messages between the kernel and the GUI, which could be based on JSON-RPC or ZeroMQ. Messages will have an envelope, with metadata like a session id and request id, plus a payload, which is data described by a tagged union: there is a tag (message type or method) plus type-specific data.
The JSON-API output format is a transitional technology that doesn't need to be complicated. All we need is a stream of payload objects, we don't need envelope metadata.
For JSON-API, I'd like to use the simplest design that works. I think that {"tag": data} is an elegant way to represent tagged data in JSON. The same representation could be embedded in a more general message envelope structure. I like my proposed scheme because:
The selection paths, such as .error.message and .shape.shader, are short. If you use jq, elixir, or curv, then the patterns used to select data from json-api output are shorter and more convenient.

If JSON-API survives into the long term, we can evolve it, adding record fields, adding new types and deprecating old ones, if the need arises.
I decided to use your suggestion of "char" rather than my original idea of "byte" because the fields "char", "line_begin" and "column" are all measured using the same units, and they are all character counts.
"I'm still working on the GUI infrastructure, but I'm going to start wiring the JSON API interop fairly soon"
That sounds great! Looking forward to seeing what your design looks like.
Thanks for the explanation on the {<type>:<value>} format, I appreciate it, and agree with the rationale. For context, I was thinking along the same lines (preparing for encapsulating the messages) and wanted to separate meta-data (top level) from data (value level) so that we could extend the message format to have an envelope integrated directly in the meta-data and avoid a namespace clash with the data.
Now just one thing I'd like to clarify, because I'm a bit confused: does char indicate a byte offset or a character offset (not as in C char, but as in a JavaScript or Python char offset -- i.e. a Unicode glyph offset)? If it's the former, I really think we should use byte instead and indicate in the documentation that column is also in bytes (although I think it should be in chars once you support UTF-8). Alternatively, this could be {offset,column,line,unit} (and maybe line_offset instead of line_begin) where unit=('bytes'|'char') -- but that's a bit more verbose.
In any case, the end goal is to avoid confusion and be mindful that in JavaScript (the primary consumer of the data) string indexation is based on the unicode glyphs, not the offset within the byte representation.
The JSON-API is primarily intended to be used by high level languages like Python and Javascript, which do not support byte indexing into strings. The {char,line_begin,column} fields need to be character indexes that work in high level languages, not byte indexes.
So that's what I've done. If I were to use the word 'byte', then it would convey the false impression that these integers cannot be used as Javascript string indexes.
Now, it happens that Curv source files are restricted to ASCII, which means that byte indexes and character indexes are the same thing. The Unicode rant that I inserted into the previous message seems to have muddied the waters. I was reminding myself that extending Curv to support non-ASCII Unicode characters is tricky, because it could potentially break an interface like JSON-API that contains character offsets.
Perfect, it's much easier for me to work with chars, and it will definitely speed up the frontend-backend interaction.
I noticed that the JSON output includes Infinity, which is not valid JSON (it works fine from Python, but not from Firefox or Chrome). Here's how to test: JSON.parse("Infinity").
Curv uses infinity a lot, but in json-api, infinities are supposed to be printed as 1e9999. If you have an example where infinity is printed as "inf" instead of as "1e9999", please post the full json-api output because I can't reproduce the bug.
I know you said "Infinity" but there is no code in curv that can print inf as "Infinity" as far as I know.
Oh, I see that Infinity is the name of a global variable in Javascript.
If I type 1e9999 into a Javascript REPL, then it prints Infinity in response.
Maybe if you read the JSON returned by curv, evaluate it to a Javascript value, then convert that to JSON text, then convert that text to a Javascript value a second time, then you will encounter this problem.
Oh, wait, it's actually Python -- I decode the JSON before re-encoding it in the webservice, which wrongly expands the numbers to Infinity, so it's more of a Python-related bug.
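The asymmetry is easy to see from a JavaScript console: 1e9999 parses (and overflows to Infinity), a bare Infinity token does not, and JavaScript's own encoder emits null for non-finite numbers, so the bare Infinity token in the output really does come from the Python re-encoding step:

```javascript
// 1e9999 is a syntactically valid JSON number; it overflows to Infinity:
const parsed = JSON.parse("1e9999");

// A bare Infinity token is not valid JSON and throws a SyntaxError:
let bareInfinityIsValid = true;
try {
  JSON.parse("Infinity");
} catch (e) {
  bareInfinityIsValid = false;
}

// JavaScript's encoder serializes non-finite numbers as null:
const reEncoded = JSON.stringify(Infinity);
```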
I've noticed some issues with the GLSL output. For instance with this one, Firefox gives me:
*** Error compiling shader: WARNING: 0:14: '/' : Divide by zero during constant folding
WARNING: 0:119: '/' : Divide by zero during constant folding
ERROR: 0:41: 'r26' : Loop index cannot be initialized with non-constant expression
ERROR: 0:78: 'r56' : Loop index cannot be initialized with non-constant expression
Is there a way for the compiler to predict which shape will compile to a working WebGL fragment shader? It would be neat to have a warning or an error from the compiler directly that explains why the shader does not work in WebGL (you mentioned something about loops in the group discussion).
I've noticed some issues with the GLSL output.
This may be a difference between desktop OpenGL and WebGL. The WebGL version of GLSL is more restricted. Or it may be that you are using WebGL 1, and the problems will go away if you switch to WebGL 2.
Loop index cannot be initialized with non-constant expression
This is a WebGL 1 restriction which is supposed to be lifted in WebGL 2. First thing to check is that you are creating a WebGL 2 context, not a WebGL 1 context, when you initialize OpenGL.
Divide by zero during constant folding
There is no Infinity constant in GLSL, so I simulate it by computing 1.0/0.0. This code works on the desktop using an OpenGL 3.2 core context. I would have expected it to work in WebGL 2 as well, but we need to run an experiment to verify that.
Is there a way for the compiler to predict which shape will compile to a working WebGL fragment shader? It would be neat to have a warning or an error from the compiler directly that explains why the shader does not work in WebGL.
If we are going to support both WebGL 1 and WebGL 2, then work is required in both the GL Compiler and in your code. You would need to test the WebGL environment, determine whether WebGL 1 or WebGL 2 is supported, and pass a command line flag to curv indicating the level of GL support. The Curv GL Compiler would then enforce compile-time restrictions based on the GL support level.
If we are just supporting WebGL 2, then I suspect that the current GL Compiler output works in WebGL 2, and no extra work is required. I am currently targeting OpenGL 3.2 Core, released 2009. WebGL 2, released Jan 2017, is based on OpenGL ES 3.0, which in turn was released Aug 2012. OpenGL 4.3, released 2012, is the first version of desktop OpenGL that provides all of the features of OpenGL ES 3.0 (and is also a superset of ES 3.0). Based on these dates, I think OpenGL 3.2 GLSL code should work fine in WebGL 2. But OpenGL implementations tend to be buggy, so testing is always required.
https://stackoverflow.com/questions/51428435/how-to-determine-webgl-and-glsl-version
WebGL1 supports GLSL ES 1.0. WebGL2 supports both GLSL ES 1.0 and GLSL ES 3.0 period. The first line in a GLSL ES 3.0 shader must be
#version 300 es
So the first line of the shader must be #version 300 es to avoid this problem. And you must create the context using const gl = someCanvas.getContext("webgl2");
Thanks for the info -- it seems that the fragment shaders are a bit different between the two versions; I get this type of error:
238: void mainImage( out vec4 fragColour, in vec2 fragCoord )
...
287: }
288:
289: void main() {mainImage(gl_FragColor, gl_FragCoord.xy);}
290: *** Error compiling shader: ERROR:
0:289: 'gl_FragColor' : undeclared identifier ERROR:
0:289: 'mainImage' : no matching overloaded function found
I'll let you know once I've learned about the differences between WebGL1 and WebGL2. I have to work on camera interaction first before tackling that one!
I didn't give you the boilerplate for main(), and I haven't updated curv -ojson-api to output the boilerplate. The epilog code that I insert at the end is:
void main(void) {
mainImage(oFragColour, gl_FragCoord.st);
}
This epilog is specific to WebGL 2. The WebGL 1 code would use gl_FragColor, as you have written.
My prolog code looks something like this:
#version 150
#define GLSLVIEWER 1
uniform vec2 iResolution;
out vec4 oFragColour;
uniform float iTime;
The main thing is that you need to define an out variable, which I call oFragColour, and you need to reference that same variable in main(). Or at least that's what works in OpenGL 3.2.
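Putting the pieces together, the client-side assembly might look like the sketch below. The prolog here is an assumption, adapted for WebGL 2 from the OpenGL 3.2 prolog shown above (#version 300 es plus a default precision qualifier, which GLSL ES requires in fragment shaders); it is not curv's exact output:

```javascript
// Assumed WebGL 2 prolog (adapted, not curv's literal output):
const prolog = `#version 300 es
precision mediump float;
#define GLSLVIEWER 1
uniform vec2 iResolution;
uniform float iTime;
out vec4 oFragColour;
`;

// Epilog as described above: main() forwards to mainImage().
const epilog = `void main(void) {
  mainImage(oFragColour, gl_FragCoord.st);
}
`;

// shape.shader supplies the mainImage() definition between the two.
function assembleShader(shaderBody) {
  return prolog + shaderBody + epilog;
}

// Minimal placeholder body, just for illustration:
const full = assembleShader(
  "void mainImage(out vec4 fragColour, in vec2 fragCoord) { fragColour = vec4(0.0); }\n"
);
```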
Thanks a lot! I also had to change the vertex shader a bit (include the version and change attribute position to in position), but it worked!
So the next step after that would be to get the parameters output in the json-api. I still have a good week and a half of work to do to fix the remaining issues until I move on to the parameterization UI.
I tried Curv a few months ago, but it didn't work due to my old Intel graphics card. Today I came across https://github.com/floooh/sokol-tools/blob/master/docs/sokol-shdc.md and thought it might help with such issues, as well as with the issues mentioned in this discussion.
@dumblob When you say "old Intel graphics card", what model of graphics card is it? If you still have the curv executable, what does 'curv --version' print? I'm wondering if your GPU really is too old to support Curv, or if there is some other problem that is fixable.
Thanks for the link to sokol-tools. I'm looking at several GPU middleware layers to fix my GPU problems. Right now I'm most excited about Google's Dawn library. https://dawn.googlesource.com/dawn
When you say "old Intel graphics card", what model of graphics card is it?
I've compiled curv right now just to test it again and this is the output:
curv> cube
3D shape 2×2×2
curv> GLFW error 0x10007: GLX: Failed to create context: GLXBadFBConfig
ABORT: GLFW create window failed
255$ curv --version
Curv: 0.4-260-g4c2f3826
Compiler: gcc 9.1.0
Kernel: Linux 4.19.67-1-lts x86_64
GPU: Intel Open Source Technology Center, Mesa DRI Mobile Intel® GM45 Express Chipset
OpenGL: 2.1 Mesa 19.1.5
(but I would be glad if it worked, of course :wink:)
@dumblob that's a legitimate error. It confirms that your Intel GM45, which is from 2009, is too old to run Curv.
The JSON API feature is stable, so I'm closing the issue.
We've been discussing adding a JSON API to help integrate curv with an external interactive editing environment. We agreed that a simple JSON-based, standard-I/O communication protocol would be ideal to get started. This new output format would be enabled using the -o json-api command line option.

In this protocol, each message is a JSON-encoded object terminated by \n. Each message has the following structure:

Note: Doug originally proposed to have {"print":...}, {"warning":...} but the {type,value} format makes it slightly easier to dispatch messages (it's a map keyed on the type attribute) and leaves room for extension without conflict.

Print & Warning messages
Error message
Value message
Shape message