Open JimClarke5 opened 2 years ago
I think DL4J also has some parsing logic for the JSON emitted by Keras; maybe we could look there to see what hacks they did? I'm surprised TF Python is emitting malformed JSON. Is there any comment in the code about it, or an open issue on their repo?
They do seem to do some custom stuff yes, see https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/saving/saved_model/json_utils.py
Yes, I have seen this, and it is pretty straightforward to implement these customizations in GSON.
However, what seems to be stored in the SavedModel is not pure JSON, but a string version of a Python dict. It is close to JSON, but not 100%.
The following Python code demonstrates this:
import numpy as np

a = np.array([[0.2, 0.2], [0.1, 0.3]], dtype="float32")
m = {"array": a}
print(m)
-->
{'array': array([[0.2, 0.2],
       [0.1, 0.3]], dtype=float32)}
Is there a way to translate a Java Map to a Python dict in both directions, considering the ndarray issue above?
TF Python serializes NumPy-style arrays (e.g. Java NdArray) in the format 'normalizer': array([[0.2, 0.2], [0.1, 0.3]], dtype=float32). This is not standard JSON, and tools like GSON throw a MalformedJsonException when trying to parse it. The issue is how best to handle this in TF Java.
I have looked at GSON using customized TypeAdapter and Serializers/Deserializers, but I cannot get past the low-level parsing throwing the MalformedJsonException. Right now I think my only alternative is to write a pre-parser to convert the array() format string to a well-formed JSON string. In the converted form for the example above, _ArraySize_ is the array's shape and _ArrayType_ is the datatype for the NdArray. This format is taken from OpenJData.
Also, once we settle on a way to do this, should the NdArray package be modified to include serialize/deserialize methods? What about compatibility with TF Python?
Any suggestions?
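To make the pre-parser idea concrete, here is a minimal sketch. It is written in Python for brevity, not Java, and is not a proposed implementation. The function names are hypothetical. The `_ArrayData_` key and its flattened row-major layout are assumptions, since only `_ArraySize_` and `_ArrayType_` are named above. The dtype name is copied verbatim. The sketch also assumes every `array(...)` carries an explicit `dtype=` and that the text contains no truncated (`...`) arrays or `nan`/`inf` values.

```python
import ast
import json
import re

# Matches array(<nested list>, dtype=<name>); the nested list may span lines.
ARRAY_CALL = re.compile(r"array\((\[.*?\])\s*,\s*dtype=(\w+)\)", re.DOTALL)


def shape_of(nested):
    """Return the shape of a regular (non-ragged) nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0] if nested else None
    return shape


def flatten(nested):
    """Flatten a nested list into a single list in row-major order."""
    if not isinstance(nested, list):
        return [nested]
    return [value for item in nested for value in flatten(item)]


def array_call_to_json(match):
    """Rewrite one array(...) call as a JSON object string."""
    data = ast.literal_eval(match.group(1))  # safely parse the nested list
    return json.dumps({
        "_ArrayType_": match.group(2),       # dtype name, copied verbatim
        "_ArraySize_": shape_of(data),
        "_ArrayData_": flatten(data),        # assumed key and layout
    })


def python_repr_to_json(text):
    """Convert a dict repr containing array(...) calls to a JSON string."""
    without_arrays = ARRAY_CALL.sub(array_call_to_json, text)
    # After substitution the text is a plain Python literal, so literal_eval
    # can read it and json.dumps normalizes the single-quoted keys.
    return json.dumps(ast.literal_eval(without_arrays))


if __name__ == "__main__":
    text = "{'array': array([[0.2, 0.2],\n       [0.1, 0.3]], dtype=float32)}"
    print(python_repr_to_json(text))
```

The final step relies on Python's own literal parser to deal with the single-quoted keys. A Java version would need to handle that quoting itself, for example in the same pre-parsing pass, before handing the string to GSON.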