Closed: omus closed this 4 months ago
Back when I started on this package, I have a vague memory of JSON.parse() not working well with IOBuffer. I think maybe it had something to do with carriage returns in comment fields. If I remember right, that's why all the tests use strings for JSON data. It could be all those issues have been fixed by now.
I'll want to experiment a bit with this and make sure there are no gotchas.
@omus I don't think this is going to work because of problems with JSON. Here's an example:
julia> using JSON
julia> JSON_string_noCR= """ {"annotations": [
{
"annotationDate" : "2024-05-24T17:14:55Z",
"annotationType" : "REVIEW",
"annotator" : "Person: Jane Doe (nowhere@loopback.com)",
"comment" : "This is a comment"
}
]}"""
julia> JSON_string= """ {"annotations": [
{
"annotationDate" : "2024-05-24T17:14:55Z",
"annotationType" : "REVIEW",
"annotator" : "Person: Jane Doe (nowhere@loopback.com)",
"comment" : "This is a comment\nContinued"
}
]}"""
julia> IOBuf_noCR= IOBuffer("""{"annotations": [
{
"annotationDate" : "2024-05-24T17:14:55Z",
"annotationType" : "REVIEW",
"annotator" : "Person: Jane Doe (nowhere@loopback.com)",
"comment" : "This is a comment"
}
]}""")
julia> IOBuf= IOBuffer("""{"annotations": [
{
"annotationDate" : "2024-05-24T17:14:55Z",
"annotationType" : "REVIEW",
"annotator" : "Person: Jane Doe (nowhere@loopback.com)",
"comment" : "This is a comment\nContinued"
}
]}""")
# These parse just fine
julia> JSON.parse(JSON_string_noCR)
julia> JSON.parse(IOBuf_noCR)
# But these error
julia> JSON.parse(JSON_string)
ERROR: ASCII control character in string
Line: 5
Around: ...a comment Continued" } ]}...
^
Stacktrace:
julia> JSON.parse(IOBuf_noCR)
ERROR: Unexpected end of input
...when parsing byte with value '0'
Stacktrace:
Using JSON.parsefile does not have this problem. Do you have any ideas why this happens?
The issue with JSON.jl and control characters occurs whether you parse from a string, parse from an IO, or use parsefile:
julia> using JSON
julia> JSON_string= """ {"annotations": [
{
"annotationDate" : "2024-05-24T17:14:55Z",
"annotationType" : "REVIEW",
"annotator" : "Person: Jane Doe (nowhere@loopback.com)",
"comment" : "This is a comment\nContinued"
}
]}""";
julia> IOBuf= IOBuffer(JSON_string);
julia> JSON.parse(JSON_string);
ERROR: ASCII control character in string
Line: 5
Around: ...a comment Continued" } ]}...
^
...
julia> JSON.parse(seekstart(IOBuf));
ERROR: ASCII control character in string
...when parsing byte with value '67'
...
julia> file = tempname() * ".json";
julia> write(file, JSON_string)
236
julia> JSON.parsefile(file);
ERROR: ASCII control character in string
Line: 5
Around: ...a comment Continued" } ]}...
^
...
I'll mention that the error you showed in your comment (Unexpected end of input) was due to having already read to the end of the IOBuffer, which is why I added seekstart into my example above:
julia> JSON.parse(IOBuf_noCR);
ERROR: Unexpected end of input
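The read-position behaviour described above can be reproduced with Base alone. This is a minimal sketch that uses read in place of JSON.parse, on the assumption that any reader that consumes the whole buffer leaves it in the same state; the variable names are only for illustration.

```julia
# An IOBuffer created from a string is readable and seekable.
io = IOBuffer("""{"a": 1}""")

# The first read consumes everything and leaves the position at the end.
first_read = read(io, String)
at_end = eof(io)

# A second read from the same position finds nothing left.
second_read = read(io, String)

# seekstart rewinds the buffer (and returns it), so the data can be read again.
seekstart(io)
third_read = read(io, String)
```

Because seekstart returns the buffer itself, it can be passed inline, as in JSON.parse(seekstart(IOBuf)).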
The JSON spec doesn't allow raw control characters such as a newline inside a string; they have to be escaped. The JSON escape sequence \n (a backslash followed by n) is therefore written in a Julia string literal, and displayed in the REPL, as \\n. With JSON.jl you can use JSON.json to perform the escaping for you:
julia> JSON.json("\n")
"\"\\n\""
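To make the difference between a raw newline and the JSON escape sequence concrete, here is a small sketch. It assumes JSON.jl is installed and uses only JSON.json and JSON.parse; the variable names are made up for the example.

```julia
using JSON

# In a Julia literal, "\n" is one character (a real newline),
# while "\\n" is two characters (a backslash followed by 'n').
raw_newline = "\n"
escaped_newline = "\\n"

# JSON.json turns the real newline into the two-character escape sequence.
json_text = JSON.json(Dict("comment" => "This is a comment\nContinued"))

# JSON written by hand in a Julia literal needs the doubled backslash
# so that the backslash itself reaches the parser.
by_hand = """{"comment": "This is a comment\\nContinued"}"""
parsed_by_hand = JSON.parse(by_hand)
```

The parsed comment contains a real newline again, because the parser decodes the escape sequence.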
Are there any other issues you can see with this change?
I'm understanding better what's going on here. In my package PkgToSoftwareBOM.jl I do generate strings with a newline in them. When I write them to a file with JSON.print(), it automatically converts each newline control character into the two literal characters "\" and "n". Then, when reading the file, JSON.parse sees those two characters together and combines them back into a newline.
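The write-then-read round trip described here can be sketched entirely in memory. This assumes JSON.jl's JSON.print(io, obj) and JSON.parse(io) methods; the document contents are invented for illustration and are not taken from PkgToSoftwareBOM.jl.

```julia
using JSON

# A document whose comment field contains a real newline character.
doc = Dict("annotations" => [Dict("comment" => "This is a comment\nContinued")])

# Write the JSON into a read/write in-memory buffer.
io = IOBuffer()
JSON.print(io, doc)

# Rewind and look at the text that was actually written.
seekstart(io)
written = read(io, String)

# Rewind again and parse from the same buffer.
seekstart(io)
roundtrip = JSON.parse(io)
```

The written text holds the escape sequence rather than a raw newline, and the parsed value holds the newline again.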
Ran a read/write test with a complicated SPDX file in PkgToSoftwareBOM/examples. The functions work. I'll merge soon.
Released as v0.4.1
Add IO methods for readspdx and writespdx to support working with in-memory data.
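One common way to offer both file and IO methods is to have the path method open the file and delegate to the IO method. The sketch below shows that pattern with hypothetical functions readdoc and writedoc; these are stand-ins for illustration only and are not the readspdx or writespdx signatures, which this thread does not show. It also assumes plain JSON data, and reads the whole stream into a string before parsing.

```julia
using JSON

# Hypothetical IO methods: these do the real work.
writedoc(io::IO, doc::AbstractDict) = JSON.print(io, doc)
readdoc(io::IO) = JSON.parse(read(io, String))

# Hypothetical path methods: open the file, then delegate to the IO methods.
writedoc(path::AbstractString, doc::AbstractDict) =
    open(io -> writedoc(io, doc), path, "w")
readdoc(path::AbstractString) = open(readdoc, path)

doc = Dict("comment" => "This is a comment\nContinued")

# In-memory round trip through an IOBuffer.
buf = IOBuffer()
writedoc(buf, doc)
from_memory = readdoc(seekstart(buf))

# The same round trip through a temporary file.
path = tempname() * ".json"
writedoc(path, doc)
from_file = readdoc(path)
```

Keeping the logic in the IO methods means tests can work on in-memory buffers without touching the filesystem.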