reorx / httpstat

curl statistics made simple
MIT License

Could not decode json #8

Open XhmikosR opened 8 years ago

XhmikosR commented 8 years ago

Windows 7 64-bit, Python 2.7.12 64-bit

C:\Users\xmr\Desktop>curl --version
curl 7.50.1 (x86_64-pc-win32) libcurl/7.50.1 OpenSSL/1.0.2h zlib/1.2.8 WinIDN libssh2/1.7.0 nghttp2/1.13.0
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp scp sftp smtp smtps telnet tftp
Features: AsynchDNS IDN IPv6 Largefile SSPI Kerberos SPNEGO NTLM SSL libz TLS-SRP HTTP2

C:\Users\xmr\Desktop>httpstat google.com
←[38;5;248m←[0m
←[33mCould not decode json: Expecting property name: line 2 column 22 (char 24)←
[0m
curl result: 0 ←[38;5;248m{
"time_namelookup": 0,110,
"time_connect": 0,328,
"time_appconnect": 0,000,
"time_pretransfer": 0,328,
"time_redirect": 0,000,
"time_starttransfer": 0,391,
"time_total": 0,391,
"speed_download": 659,000,
"speed_upload": 0,000
}←[0m ←[38;5;248m←[0m
reorx commented 8 years ago

Try set LANG=C or set LANG=en_US; does that make a difference?

XhmikosR commented 8 years ago

Nope, same thing.

reorx commented 8 years ago

The situation you encountered exposes two problems with httpstat on Windows.

  1. Setting LC_ALL=C does not make curl use a dot "." as the decimal mark for floats, as it does on Unix.
  2. The terminal control codes used to colorize output have no effect on Windows either.

The second one is easier to fix: I can disable color specifically for Windows. For the first one, I need to know which environment variable on Windows does the same thing as LC_ALL on Unix, and what the proper value for it is. Since I have no Windows environment, it is hard for me to figure this out. I'll try to fix it, but it may take some time. PRs that help solve these problems on Windows are welcome.

XhmikosR commented 8 years ago

For the first issue, this might help http://stackoverflow.com/a/956084

You could test Windows by adding AppVeyor to the repo or something.

XhmikosR commented 8 years ago
Python 2.7.12 (v2.7.12:d33e0cf91556, Jun 27 2016, 15:24:40) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_ALL, "")
'Greek_Greece.1253'
>>>

I'm not experienced with Python, but it seems it only sees my current locale regardless of the env vars LANG and LC_ALL.

XhmikosR commented 8 years ago

This might help too http://bugs.python.org/issue21808

chimeno commented 8 years ago

same here after upgrading to macOS..

Python 2.7.11 httpstat 1.1.1

reorx commented 8 years ago

Please paste your command output, and the output of echo $LANG and echo $LC_ALL here, thanks.


chimeno commented 8 years ago
echo $LANG
es_ES.UTF-8
echo $LC_ALL
httpstat www.google.es

Could not decode json: Expecting property name: line 2 column 22 (char 23)
curl result: 0 {
"time_namelookup": 0,003,
"time_connect": 0,058,
"time_appconnect": 0,000,
"time_pretransfer": 0,058,
"time_redirect": 0,000,
"time_starttransfer": 0,153,
"time_total": 0,161,
"speed_download": 70545,000,
"speed_upload": 0,000
}
httpstat --version
httpstat 1.1.1
reorx commented 8 years ago

Your problem should be solved by upgrading httpstat; the current version is 1.2.0.

The Windows issue remains; I'm waiting for suggestions from an experienced Windows cmd user.


chimeno commented 8 years ago

I thought I had the latest version :/ Yep, it works. Thanks.

XhmikosR commented 8 years ago

The Windows issue remains; I'm waiting for suggestions from an experienced Windows cmd user.

I am an experienced Windows user myself. I still don't get why the error happens due to my limited Python knowledge. So, if you have any suggestions or patches you want me to try, let me know.

reorx commented 8 years ago

@XhmikosR Sorry for not responding earlier, I was away from GitHub for some days.

First of all, thank you for offering to help. I will explain this problem as clearly as possible :)

There are two things we should know before understanding why this happens:

  1. Different countries use different decimal marks, mainly in two formats: . (e.g. 0.003) and , (e.g. 0,003).
  2. httpstat tells curl to output statistics like this

    curl -w '{"time_connect": %{time_connect}}' google.com

    and decodes the result as JSON for further use in the program.

For people in countries that use a point as the decimal mark, like the US or China, curl will output {"time_connect": 0.069}, which is valid JSON. But for people using a comma as the decimal mark, it will output {"time_connect": 0,069}, which is not valid JSON, so decoding it directly in Python raises an exception.

$ python -c "import json; json.loads('{\"time_connect\": 0,069}')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 380, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting property name: line 1 column 20 (char 19)

To avoid this, httpstat calls curl with the environment variable LC_ALL=C set. On Unix systems, this tells applications to use the default locale for output (more on stackoverflow), so curl uses a point as the decimal mark even if your computer's localization settings use a comma.
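
As a rough illustration of the idea above (not httpstat's actual code), here is a minimal sketch of preparing a curl call with LC_ALL=C forced in the child environment. The function name build_curl_invocation and the exact set of curl flags are assumptions made for this example; only the -w format string and LC_ALL=C come from the discussion.

```python
import os


def build_curl_invocation(url, base_env=None):
    """Return (argv, env) for running curl under the C locale.

    Hypothetical helper for illustration; httpstat's real code may differ.
    """
    # Copy the parent environment so the caller's mapping is not mutated.
    env = dict(os.environ if base_env is None else base_env)
    # Ask for the default "C" locale, which uses "." as the decimal mark on Unix.
    env["LC_ALL"] = "C"
    # Same style of format string as in the comment above.
    write_out = '{"time_connect": %{time_connect}}'
    # -o discards the response body so stdout holds only the -w output;
    # -s silences the progress meter and -S keeps error messages visible.
    argv = ["curl", "-w", write_out, "-o", os.devnull, "-s", "-S", url]
    return argv, env
```

The pair could then be passed to something like subprocess.check_output(argv, env=env). As this thread shows, the LC_ALL override only has the intended effect on Unix-like systems.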

But on Windows, setting LC_ALL=C does not work at all: curl still follows the system's localization, and so this problem occurs.

From my perspective, I want to know whether an application's language settings can be changed at runtime on Windows, whether through environment variables or some other way. If there is a way, I can make httpstat do the same thing for curl. That's why I suggested trying set LANG=C before. I also searched around for this, but still don't have a clue.

Lastly, if there's nothing like LC_ALL=C on Windows, there's still a way to solve this problem: change the output to {"time_connect": "0,069"} so that the number is treated as a string, then replace , with . and convert the string back to a number. This is tricky and not elegant, so it is only a last resort.
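
A minimal sketch of that last-resort approach, assuming every value in the curl output has been quoted as a string. The name decode_quoted_stats is hypothetical, and the sketch does not handle locales that also insert thousands separators.

```python
import json


def decode_quoted_stats(raw):
    """Decode curl output whose numbers were emitted as JSON strings."""
    stats = json.loads(raw)
    # Normalize a comma decimal mark to a point, then convert to float.
    return {key: float(value.replace(",", ".")) for key, value in stats.items()}


print(decode_quoted_stats('{"time_connect": "0,069"}'))
# prints {'time_connect': 0.069}
```

Because a point is left untouched by the replacement, the same function works whether curl emits 0,069 or 0.069.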