bhall2001 / fastjson

A Livecode library for JSON encoding and decoding to and from Arrays.
MIT License

"infinity" being detected as a number #7

Closed bhall2001 closed 8 years ago

bhall2001 commented 8 years ago

JSONLint reports malformed data. Found the issue: a dog named "Infinity" is having its name converted to a "number" called infinity. Interesting...

madpink commented 8 years ago

are you interacting with another program, or database?

I was actually under the impression that values of infinity default to null with json, so I am curious as to what the cause is

bhall2001 commented 8 years ago

The culprit is Livecode. Open up message box and type

put "infinity" is a number

When the parser is building out the JSON the LC script asks if the current token is a number. Turns out "infinity" is a number to Livecode.

As an aside, type this in the message box:

put 081105004 is a number

LC says true. JSON says false. I'm up in the air...
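For reference, here is what "JSON says false" means in practice. This is a minimal Python sketch (not Livecode, and not code from fastJson) of the number grammar in the JSON specification, RFC 8259; the names JSON_NUMBER and is_json_number are made up for illustration.

```python
import re

# JSON number grammar from RFC 8259: optional minus sign, an integer part
# with no leading zeros, then an optional fraction and an optional exponent.
JSON_NUMBER = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?")


def is_json_number(token: str) -> bool:
    """Return True only if the whole token is a valid JSON number literal."""
    return JSON_NUMBER.fullmatch(token) is not None


print(is_json_number("081105004"))  # False: leading zero
print(is_json_number("81105004"))   # True
print(is_json_number("infinity"))   # False: not in the grammar at all
```

Under this grammar a value like 081105004 can only appear in JSON as a quoted string.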

madpink commented 8 years ago

That is interesting, I played around in Livecode, and it does keep the leading 0 in place as a number until you perform some sort of action on it (e.g. add 1 to tNum)

I would make a number with a leading zero a string.

bhall2001 commented 8 years ago

here's some more fun I just figured out (not in LC dictionary). All the strings below return "true" to isNumber

zero, one, two, three, four, five, six, seven, eight, nine, ten

This actually has implications for the existing libraries: both easyJson and libJson use isNumber, and I know for sure that libJson would not return all the above as an un-quoted word as the data element IF these words are the only word of the data element.

madpink commented 8 years ago

that is an excellent point, it's almost as if it would be necessary to scan every character to verify that it is an actual number (or element that could make up a number)

bhall2001 commented 8 years ago

Well, yes you can, and in fact that is what easyJson does. BUT THAT IS VERY SLOW IN LC as it is not a compiled language, and as the JSON string length increases I believe the slowdown is close to linear. I'm about to post the fix I've come up with. It adds about 20ms on my 94k test file...

bhall2001 commented 8 years ago

And in case others stumble upon this thread, here's a REAL LIFE reason this is a concern. As it turns out, the sample data file I'm using is about dogs. There is a dog in the data named Infinity. I'm not making that up. When I ran the .json files produced by easyJson, libJson and fastJson through a linter, it reported that the document was not valid.

Intended Result:

          LC array:    Dogs[Name]="Infinity"
      becomes JSON:    {"Dogs":[{"Name":"Infinity"}]}

However, when the logic in Livecode says "if aValue is a number" and aValue happens to be "Infinity" (or zero, one, two...), the conditional returns true which results in:

              JSON:    {"Dogs":[{"Name":Infinity}]}

Notice no quotes around Infinity as Livecode reported that this is a number.
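A quick way to reproduce the linter's verdict outside Livecode is sketched below in Python. One caveat: Python's json module is lenient by default and accepts a bare Infinity, so this sketch passes a parse_constant hook that rejects it. The helper names reject_constant and is_strict_json are made up for illustration.

```python
import json


def reject_constant(name: str):
    # The json module calls this hook for the bare tokens
    # Infinity, -Infinity and NaN, none of which are valid JSON.
    raise ValueError(f"{name} is not valid JSON")


def is_strict_json(text: str) -> bool:
    """Return True if text parses as JSON with no bare Infinity/NaN tokens."""
    try:
        json.loads(text, parse_constant=reject_constant)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


print(is_strict_json('{"Dogs":[{"Name":"Infinity"}]}'))  # True
print(is_strict_json('{"Dogs":[{"Name":Infinity}]}'))    # False
```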

The code in the Json lib sends back the "number" Infinity as the value for the Json string. Don't worry, fastJson has it covered now ;-)

Thanks Infinity! I owe you a cookie next time I see you!!!!!

madpink commented 8 years ago

Thinking out loud... if you do a find and replace of these characters: N, T, F, S, and then run isNumber, then any constant should be identified as a string instead of a number, but it shouldn't disrupt anything that is actually a number

bhall2001 commented 8 years ago

Here's how I fixed it with minimal time impact. I still check if pValue is a number. Then, for all values whose length is > 1, I check if char 2 of pValue is a number. If char 2 is a number, then it's a number. If char 2 is not a number, then it's a text string. Still fast and as far as I can tell, it works great! QED ;-)

bhall2001 commented 8 years ago

Also, as I honestly don't know all the text strings that could be numbers, I think this is pretty robust to words I don't know about.

madpink commented 8 years ago

the only ones I've seen listed are zero to ten and pi... however you've established infinity is a number and that isn't even in the dictionary

there is one more scenario worth noting... 2e+10 would be considered a number in both JSON and Livecode

testing char 2 would reveal it to be a string
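To make the 2e+10 case concrete, here is a Python sketch of the char 2 check as described in this thread. It is not the fastJson source. lc_is_number is a hypothetical stand-in for Livecode's "is a number" (Python's float() also happens to accept "infinity" and "2e+10", but it is not the same test), and emit_unquoted is an invented name.

```python
def lc_is_number(value: str) -> bool:
    # Hypothetical stand-in for Livecode's `is a number`. Python's float()
    # accepts "infinity" and "2e+10" as well, but the two tests differ.
    try:
        float(value)
        return True
    except ValueError:
        return False


def emit_unquoted(value: str) -> bool:
    """Char-2 heuristic: True means write the value to JSON without quotes."""
    if not lc_is_number(value):
        return False
    if len(value) > 1:
        # Multi-character values count as numbers only if char 2 is a digit.
        return value[1].isdigit()
    return True


print(emit_unquoted("Infinity"))  # False: char 2 is "n", so it gets quoted
print(emit_unquoted("42"))        # True
print(emit_unquoted("2e+10"))     # False: char 2 is "e", so it gets quoted
```

The last line is the scenario above: a value that is a legitimate number in both JSON and Livecode ends up quoted as a string.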

bhall2001 commented 8 years ago

Oh what a night I've had with this ;-) I did find a bug in Livecode http://quality.runrev.com/show_bug.cgi?id=16162 and I ended up solving this issue with brute force for now. It's not elegant but it's the fastest solution I can come up with. As soon as I introduce any kind of comparisons of chars or regex, the times go out the window. Look for an update to the branch shortly.

madpink commented 8 years ago

could you share part of the original data you used?

I've just been converting back and forth with JSONs using v0.2 and as long as the original was in quotes, it remains a string even if it is one of the constants.

I'm wondering if the problem is how you brought the data into the array in the first place...

bhall2001 commented 8 years ago

The problem I was having is that I was focused on "Infinity" and was not looking at the other constants. I couldn't make it work with Infinity, so I inferred that all constants are broken. They are not. As long as you use "is a number", the constants are treated as strings, except for Infinity, which just doesn't work. I'll put my sample .json in the repository shortly. I need to go get a bite to eat before I pass out...

bhall2001 commented 8 years ago

Here's how I'm catching the "anomaly" in Livecode when one would like to use the word "Infinity" as a data element:

if char 1 of pValue is "i" then return quote & pValue & quote else return pValue
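For readers who don't use Livecode, the guard above can be sketched in Python as follows. The function name encode_number_like is invented, and the sketch assumes the value has already passed the number test. Livecode string comparison ignores case by default, which is why "Infinity" matches "i"; lower() reproduces that here.

```python
def encode_number_like(value: str) -> str:
    """Mirror the first-character guard for a value that passed the number test."""
    # Livecode compares strings case-insensitively by default, so both
    # "Infinity" and "infinity" match "i"; lower() mimics that behaviour.
    if value[:1].lower() == "i":
        return '"' + value + '"'
    return value


print(encode_number_like("Infinity"))  # "Infinity" (with quotes)
print(encode_number_like("42"))        # 42
```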

bhall2001 commented 8 years ago

Here is what is happening with Infinity:

put ten is a number (true)
put "ten" is a number (false) <-- ten is a string when quotes are around it
put infinity is a number (true)
put "infinity" is a number (true) <-- LC does not recognize the word "infinity" as a string. It can only be a number.