Closed Numpsy closed 1 year ago
Thanks for the contribution. Question: Should we use the generators to be compatible with AOT and remove the existing code?
I thought that being better for AOT was supposed to be one of the benefits of the source generator, but I haven't tested that myself
Sorry, maybe I didn't explain myself very well.
So my two questions are
As you already added the generator
[JsonSerializable(typeof(GrammarDefinition))]
Should we remove the KeepType stuff added by @hez2010?
Should we remove the Newtonsoft dep in the props file?
I think the answer to both is YES.
For (2), this change only changes the part that parsers grammers - there is another use of newtonsoft that does plists at
That's used to read the grammar file. We should migrate that too.
I had a small go a while back at replacing JsonTextReader
with Utf8JsonReader
and I think the small issue I had there was that the current APIs in the area all use Stream or StreamReader, whilst Utf8JsonReader operates on spans and sequences.
From what I recall, I think the situation was that as long as JsonPlistParser is given a StreamReader then it either needs to read the whole thing into an array to use as a span, or do something like https://learn.microsoft.com/en-us/dotnet/standard/serialization/system-text-json/use-dom-utf8jsonreader-utf8jsonwriter?pivots=dotnet-7-0#read-from-a-stream-using-utf8jsonreader with chunks of data. (Maybe none of the files it's reading are big enough for it to be an issue though, just thought I'd query first)
Thanks for taking a look.
Yes, it's odd that the Utf8JsonReader
doesn't allow to read from a Stream
.
It seems that we need to do it manually ... Have a buffer and read chunks of bytes from the stream, let's say 2048 each time, and pass them to the Utf8JsonReader
as a span.
@Numpsy thanks for the effort in bringing System.Text.Json
to this project. Unfortunately, using the UtfJsonReader
to read from a stream is not trivial.
Probably we could go with the current changes, and let the JSONPListParser still use Newtonsoft, but I prefer not to rely on two different mechanisms (which involves downloading both NewtonSoft
+ System.Text.Json
from NuGet).
Probably the dotnet guys add a UtfJsonStreamReader
soon and we can revisit this stuff.
Hmm, that looks more complicated that the MS example :-(
Another possibility might be to see how it'd look using JsonDocument
instead of using the reader directly?
I gave a try to the MS example but I was unable to get it working.
I also gave a try to the StackOverflow solution which is not trivial. Also getting exceptions there.
So I'm not confident this is going to work fine unless the dotnet guys release a official UtfJsonStreamReader
.
In the other hand, I'm not sure if we can get the tokens from the JsonDocument
, but unfortunately, it requires loading the whole JSON grammar in memory to later build the IRawGrammar
which doesn't sound like a good idea.
That being said, if you give it a try and you find a solution for it, please let me know :-)
I made a draft that uses System.Text.Json
to read the JSONPListParser
. Now is loading the whole bytes in memory (but it's working without exceptions. Not too bad since the biggest grammar is ~655KB.
But this is not an acceptable solution. A requirement for this PR to be merged would be read in chunks. If someone wants to take a look into it, it would be appreciated!
I'll try to give it another look next week
I found this old commit from when I was last playing with things, just in case it's of interest: https://github.com/Numpsy/TextMateSharp/commit/15b2cef0a492ce002acd40c00f8c3979fd65b6b4
I had a first go at making it read the data in chunks... I don't know if it's the best approach, but the unit tests are passing for me at least.
Refs https://github.com/danipen/TextMateSharp/issues/34
This is just replacing Newtonsoft.Json with System.Text.Json - it doesn't contain any of changes to the trimming related functionality mentioned in #34
Fixes #34