thrift parse error when uploading results.

scphantm commented 5 years ago

Describe the bug

I think this is related to #2387, but any time I store my runs, I get the following:

[DEBUG][2019-10-25 14:30:39] {system} [4199] <140318562117440> - webserver_context.py:192 get_context() - Loading layout config.
[DEBUG][2019-10-25 14:30:39] {system} [4199] <140318562117440> - webserver_context.py:196 get_context() - /net/nfs.paneast.panasas.com/software/codechecker/rhel_7_amd64/codechecker-6.10.1/build/CodeChecker/config/package_layout.json
[DEBUG][2019-10-25 14:30:39] {system} [4199] <140318562117440> - webserver_context.py:206 get_context() - {u'checker_md_docs': u'www/docs/checker_md_docs', u'bin': u'bin', u'www': u'www', u'ld_logger_lib_name': u'ldlogger.so', u'docs': u'www/docs', u'checkers_severity_map_file': u'config/checker_severity_map.json', u'lib_plist_to_html': u'lib/python2.7/plist_to_html', u'lib_tu_collector': u'lib/python2.7/tu_collector', u'analyzers': {u'clangsa': u'clang', u'clang-tidy': u'clang-tidy'}, u'ld_logger_lib_path': u'ld_logger/lib', u'web_client_dojo': u'www/scripts/plugins/dojo', u'plist_to_html_bin': u'bin/plist-to-html', u'ld_logger_bin': u'bin/ldlogger', u'plist_to_html_dist_path': u'lib/python2.7/plist_to_html/static', u'plist_to_html': u'plist_to_html', u'web_client_codemirror': u'www/scripts/plugins/codemirror', u'ld_logger': u'ld_logger', u'lib': u'lib', u'plugin': u'plugin', u'web_client_jsplumb': u'www/scripts/plugins/jsplumb', u'config_db_migrate': u'lib/python2.7/codechecker_server/migrations/config', u'web_client_highlightjs': u'www/scripts/plugins/highlightjs', u'web_client_plugins': u'www/scripts/plugins', u'run_db_migrate': u'lib/python2.7/codechecker_server/migrations/report', u'web_client_marked': u'www/scripts/plugins/marked', u'cc_bin': u'cc_bin', u'js_thrift': u'www/scripts/plugins/thrift', u'userguide': u'www/userguide', u'config': u'config', u'tu_collector': u'tu_collector', u'web_client': u'www/scripts/codechecker-api'}
[ERROR][2019-10-25 14:31:10] {system} [4199] <140318562117440> - thrift_call.py:67 wrapper() - Thrift invalid data error.
[ERROR][2019-10-25 14:31:10] {system} [4199] <140318562117440> - thrift_call.py:74 wrapper() - massStoreRun
[ERROR][2019-10-25 14:31:10] {system} [4199] <140318562117440> - thrift_call.py:75 wrapper() - ('rhel_7_amd64-live_test-jenkins_update', None, '6.10.1', '[base64 of file]'
[ERROR][2019-10-25 14:31:10] {system} [4199] <140318562117440> - thrift_call.py:76 wrapper() - {}
[ERROR][2019-10-25 14:31:10] {system} [4199] <140318562117440> - thrift_call.py:77 wrapper() - Request failed.
Traceback (most recent call last):
  File "/net/nfs.paneast.panasas.com/software/codechecker/rhel_7_amd64/codechecker-6.10.1/build/CodeChecker/lib/python2.7/codechecker_client/thrift_call.py", line 38, in wrapper
    res = func(*args, **kwargs)
  File "/net/nfs.paneast.panasas.com/software/codechecker/rhel_7_amd64/codechecker-6.10.1/build/CodeChecker/lib/python2.7/codechecker_api/codeCheckerDBAccess_v6/codeCheckerDBAccess.py", line 1679, in massStoreRun
    return self.recv_massStoreRun()
  File "/net/nfs.paneast.panasas.com/software/codechecker/rhel_7_amd64/codechecker-6.10.1/build/CodeChecker/lib/python2.7/codechecker_api/codeCheckerDBAccess_v6/codeCheckerDBAccess.py", line 1696, in recv_massStoreRun
    (fname, mtype, rseqid) = iprot.readMessageBegin()
  File "/opt/codechecker/venv/lib/python2.7/site-packages/thrift/protocol/TJSONProtocol.py", line 317, in readMessageBegin
    self.readJSONArrayStart()
  File "/opt/codechecker/venv/lib/python2.7/site-packages/thrift/protocol/TJSONProtocol.py", line 305, in readJSONArrayStart
    self.readJSONSyntaxChar(LBRACKET)
  File "/opt/codechecker/venv/lib/python2.7/site-packages/thrift/protocol/TJSONProtocol.py", line 213, in readJSONSyntaxChar
    "Unexpected character: %s" % current)
TProtocolException: Unexpected character: <

It looks like this is very similar to #1120, except my files are about 20ish megs. But the part that caught my attention in #1120 is the authentication conversation. Because of issue #2387, my authentication is completely hosed, without much hope of fixing it short of huge changes to my OpenShift cluster that I'm just not going to make. Authentication is disabled in my server_config.json file; could this be an auth issue?

Odd thing: the data still appears to be making it to the server, because the server is importing it correctly.

CodeChecker version 6.10.1

csordasmarton commented 5 years ago

@scphantm You get this error message when the response to a Thrift request is not valid.

For example, we used nginx to load balance between multiple server instances. By default, the value of proxy_read_timeout was too low. If a store took longer than that value, nginx sent back a plain HTML page as the response to the Thrift request, which is not a valid Thrift response. Unfortunately, Thrift prints only the first character of the response message, which is why all you see in the traceback is <. Maybe your problem is similar to ours.

scphantm commented 5 years ago

Why are you using Thrift? I've been reading code for a few days now, and I just can't get over the feeling that using Thrift made this MUCH more complex than it needed to be. I have some offbeat platforms, and I spent 2 weeks simply trying to get Thrift installed on them so I could compile your stuff.

scphantm commented 5 years ago

For this error specifically, is there a way you can set it up so that if you do get a Thrift error, the raw response is printed in the log? That way I can at least debug the stupid thing.
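A minimal sketch of what such a log line could look like, assuming the raw response body is available as bytes at the point of failure. `describe_response` is a hypothetical helper, not part of CodeChecker or Thrift; it only guesses the payload kind from the first non-whitespace character and adds a truncated preview.

```python
def describe_response(body: bytes, preview_len: int = 200) -> str:
    """Return a log-friendly summary of a raw HTTP response body.

    Hypothetical helper: it guesses what kind of payload came back
    by looking at the first non-whitespace character.
    """
    text = body.decode("utf-8", errors="replace")
    stripped = text.lstrip()

    if not stripped:
        kind = "empty body"
    elif stripped.startswith("["):
        # Thrift's JSON protocol encodes every message as a JSON array.
        kind = "looks like Thrift JSON"
    elif stripped.startswith("<"):
        # An HTML error page from a proxy starts with '<'.
        kind = "looks like HTML/XML (possibly a proxy error page)"
    else:
        kind = "unrecognized payload"

    # Keep the preview short: store requests and responses can be large.
    preview = text[:preview_len]
    suffix = "..." if len(text) > preview_len else ""
    return f"{kind}: {preview!r}{suffix}"


if __name__ == "__main__":
    print(describe_response(b"<html><body>504 Gateway Time-out</body></html>"))
```

With a summary like this in the log, a proxy timeout page would be visible immediately instead of only the single character reported by the protocol exception.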

gyorb commented 5 years ago

We chose Thrift when the project was started because at that time no other solution could generate JS stubs for the browser, and we wanted to use the same API for the CLI and any other clients. gRPC could not generate JS stubs at that time (grpc-web v1.0.0 was released last year).

To depend less on the thrift-compiler binary, we started to create a separate repository where the stubs are already generated; it will be pulled in as a git submodule when the project is cloned. This was already planned, because many Linux distributions do not have a package with the latest Thrift compiler.

The problem you ran into is actually raised inside the Thrift source code, in /opt/codechecker/venv/lib/python2.7/site-packages/thrift/protocol/TJSONProtocol.py, where it tried to parse JSON but got HTML as the response from the nginx server. At that point execution has not yet reached our source code, where we could print out the received message.

What we did was modify the nginx config to respond to errors with valid JSON that can be parsed and printed out by the stub. Something like this:

location /t_5xx.json {
   default_type application/x-thrift;
   return 500 '[1,"nginx",3,0,{"1":{"str":"Error code $status: Internal server error."}}]';
}
error_page 500 501 502 503 504 505 506 507 508 509 510 511 = /t_5xx.json;
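To illustrate why the stub can parse that body, here is a small sketch that decodes it by hand. It assumes the usual Thrift JSON message layout of `[version, method name, message type, sequence id, body]`, with message type 3 meaning an exception, and it assumes field `"1"` of the body carries the exception message. The function name and the substituted status code 504 are made up for the example.

```python
import json

# Thrift message type codes (assumed from the Thrift protocol definitions).
MESSAGE_TYPES = {1: "CALL", 2: "REPLY", 3: "EXCEPTION", 4: "ONEWAY"}


def decode_thrift_json_envelope(payload: str) -> dict:
    """Split a Thrift JSON message into its envelope fields.

    Hypothetical helper for illustration only; real clients should let
    the generated stubs and the Thrift library do this.
    """
    version, name, mtype, seqid, body = json.loads(payload)
    # Struct fields are keyed by field id, then by type name ("str").
    message = body.get("1", {}).get("str")
    return {
        "version": version,
        "name": name,
        "type": MESSAGE_TYPES.get(mtype, "UNKNOWN"),
        "seqid": seqid,
        "message": message,
    }


if __name__ == "__main__":
    # What nginx would send with $status replaced by 504 (example value).
    sample = '[1,"nginx",3,0,{"1":{"str":"Error code 504: Internal server error."}}]'
    print(decode_thrift_json_envelope(sample))
```

Because the body is a well-formed exception message, the client can surface the text instead of failing on the first character of an HTML page.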

scphantm commented 5 years ago

Yeah, the problem is I don't have an nginx server in between; I have a customized HAProxy. I will kick the timeout on the route up to 10 minutes or so and see if that clears it up.

scphantm commented 5 years ago

BTW, DON'T use submodules; they are a royal pain. Bring your stubs in via a subtree instead:

https://www.atlassian.com/git/tutorials/git-subtree

gyorb commented 5 years ago

If the increased timeout does not solve the problem, HAProxy probably has a similar feature for custom error responses, like nginx.

Thanks for mentioning git subtrees, we will look into them.

scphantm commented 5 years ago

The problem with OpenShift is that you don't have access to the HAProxy server, because it may not be an HAProxy server at all: it may be an F5 load balancer, a proxy firewall, or any of another half dozen pieces of enterprise hardware. That's why Red Hat created routes instead of using ingress: they wanted the ability to integrate with the six-figure, enterprise-scale network firewalls and load balancers that any enterprise will already have. HAProxy just happens to be the default if you don't specify anything else. Therefore, OpenShift doesn't give you the ability to modify the proxy's configuration, because that modification may not be possible on an F5 load balancer. That's also why routes don't support URL rewrites: F5s don't support them.

As for the subtree, I have a better idea: just package the stubs as PyPI and npm modules. I just had to do some very scary module path injections so that my integration program could load the Thrift stubs and call the server to get the info it needs to decide whether the build is allowed to proceed.
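For readers unfamiliar with this kind of workaround, a rough sketch of a module path injection follows. It is not the code from the integration program mentioned above; `prepended_sys_path` is a hypothetical helper, and the directory in the usage comment is a placeholder.

```python
import contextlib
import sys


@contextlib.contextmanager
def prepended_sys_path(directory):
    """Temporarily put `directory` first on the import search path.

    Hypothetical helper: modules imported inside the `with` block stay
    loaded afterwards, but the path entry itself is removed again.
    """
    sys.path.insert(0, directory)
    try:
        yield
    finally:
        # Remove only the entry we added, if it is still present.
        try:
            sys.path.remove(directory)
        except ValueError:
            pass


# Usage sketch (placeholder path, adjust to the local install):
#
# with prepended_sys_path("/path/to/CodeChecker/lib/python2.7"):
#     import codechecker_api
```

Publishing the stubs as an installable package would make this unnecessary, since the import would then resolve through the normal package mechanism.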

Move the Thrift source files and everything else into their own project, then compile them for different languages (JavaScript, Python; Java would be VERY nice; etc.) and publish them as interface clients. Then crazy people like me who need to integrate your server into other pieces of the pipeline don't have to do strange and unusual things to call the server's API. I just load the client stubs and go.

Ericsson / codechecker

thrift parse error when uploading results. #2409