Kitware / kwiver

Pulls Together Computer Vision Algorithms into Highly-Modular Run-Time Configurable Systems

Invalid JSON serialization when NaN values are present #1816

Open 9cvele3 opened 9 months ago

9cvele3 commented 9 months ago

When using the klv and serialize arrows to serialize KLV metadata to JSON, I sometimes get invalid JSON that cannot be parsed because of the following key-value pair:
"value": NaN

NaN is not valid in JSON; it should be null instead:

"value": null

Short code snippet:

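    // Serialize the metadata map to JSON in an in-memory stream.
    std::stringstream ss;
    kwiver::arrows::serialize::json::metadata_map_io_klv map_to_json;
    map_to_json.save(ss, vital_metadata_map);
    // str can contain "value": NaN, which is not valid JSON.
    auto str = ss.str();
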
aramSofthenge commented 8 months ago

Hi @chetnieter

I am encountering an "undefined reference" error with an example project that includes the <arrows/serialize/json/klv/metadata_map_io.h> header, even though the kwiver build itself succeeds with the arrows component enabled.

Any ideas on what I'm doing wrong?

Erotemic commented 8 months ago

2cents: even though NaN is not valid JSON according to RFC 8259, it is not equivalent to null, and blindly converting it can cause downstream errors.

To illustrate this point, consider this Python list:

[1, 2, float('nan'), None]

This list is distinct from and should not be considered the same as [1, 2, None, None].
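A minimal sketch of that difference, using only Python's standard json module with its default settings (the variable names are made up for illustration):

```python
import json
import math

values = [1, 2, float("nan"), None]

# By default, Python's json module writes NaN as the non-standard token NaN.
encoded = json.dumps(values)
decoded = json.loads(encoded)

# After the round trip the NaN is still a float and the None is still None,
# so the two entries remain distinguishable.
nan_survived = isinstance(decoded[2], float) and math.isnan(decoded[2])
none_survived = decoded[3] is None

# Had the NaN been written as null, both slots would come back as None
# and the reader could no longer tell which one was a number.
collapsed = json.loads("[1, 2, null, null]")
```

Here `encoded` keeps a `NaN` token next to `null`, while `collapsed` has no way to recover which entry was originally numeric.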

In fact, many JSON parsers (including Python's built-in one and ultrajson) support non-standard symbols such as NaN and Infinity. These parse to valid floats as defined by IEEE 754 but are simply unrepresentable in "authoritative JSON". Other libraries like ijson take the opposite approach and explicitly disallow them, which IMO is unfortunate.
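For reference, here is roughly how the lenient and strict behaviours look with Python's standard json module. This is only a sketch: `reject_constant` is a hypothetical helper, and the strict branch merely imitates what a spec-only parser would do.

```python
import json

text = '{"value": NaN}'

# Default behaviour: json.loads accepts NaN, Infinity and -Infinity.
lenient = json.loads(text)

def reject_constant(name):
    # json.loads calls this hook for the tokens 'NaN', 'Infinity' and '-Infinity'.
    raise ValueError(f"non-standard JSON constant: {name}")

# Opt-in strict parsing: the hook turns the same input into an error.
try:
    json.loads(text, parse_constant=reject_constant)
    strict_accepts = True
except ValueError:
    strict_accepts = False

# Opt-in strict writing: allow_nan=False refuses to emit NaN at all.
try:
    json.dumps({"value": float("nan")}, allow_nan=False)
    strict_writes = True
except ValueError:
    strict_writes = False
```

So the same module can act either way; which one you get depends on the options the application passes.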

If you want to replace NaN with null, I think that should be an application-specific decision, done via pre- or post-processing. Or just use one of the many JSON parsers that recognize the weakness of RFC 8259 and support NaN and Infinity.
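As a rough sketch of the post-processing route in Python: read the output with a lenient parser, replace NaN explicitly, then write strict JSON. The `nan_to_none` helper and the sample `record` are hypothetical and are not actual kwiver output.

```python
import json
import math

def nan_to_none(obj):
    """Return a copy of obj with every float NaN replaced by None."""
    if isinstance(obj, float) and math.isnan(obj):
        return None
    if isinstance(obj, dict):
        return {key: nan_to_none(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [nan_to_none(value) for value in obj]
    return obj

# Hypothetical record standing in for serialized metadata.
record = {"key": "sensor_latitude", "value": float("nan"), "samples": [1.0, float("nan")]}

lenient_text = json.dumps(record)    # contains NaN tokens
parsed = json.loads(lenient_text)    # the lenient parser accepts them

# allow_nan=False makes the writer fail loudly if any NaN slipped through.
strict_text = json.dumps(nan_to_none(parsed), allow_nan=False)
```

This sketch only handles NaN; with `allow_nan=False`, an Infinity value would still raise, which forces the application to decide how to treat it as well.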