victims / victims-web

The victims web application.
GNU Affero General Public License v3.0

502 when trying to fetch updates #42

Closed abn closed 11 years ago

abn commented 11 years ago
Failed to sync database: Server returned HTTP response code: 502 for URL: https://victims-websec.rhcloud.com//service/v2/remove/1900-01-01T00:00:00

Not sure if this is OpenShift's doing or the Flask web server's. If it is the latter, should we consider alternatives, e.g. CherryPy? (http://flask.pocoo.org/snippets/24/)

We also need to be able to chunk the JSON response.

ashcrow commented 11 years ago

Interesting. I can't even hit the application at all anymore. Once I'm home I'll try to check logs and see what's up. Last time I saw something similar it turned out to be the MongoDB being ripped out from under the app.

ashcrow commented 11 years ago

Also +1 on chunked encoding.

ashcrow commented 11 years ago

I'm seeing the same error in the logs where the MongoDB connection seems to get cut while getting data. Still looking.

ashcrow commented 11 years ago

I just tried the URL above and I'm getting back:

[]
ashcrow commented 11 years ago

@abn can you see if you are getting the same error?

ashcrow commented 11 years ago

Right now we are using mod_wsgi in Apache: http://flask.pocoo.org/docs/deploying/mod_wsgi/. I'm not opposed to using something different though.

Also, looks like http://flask.pocoo.org/docs/patterns/streaming/ is a good pattern for returning the larger json items.
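A minimal sketch of that streaming pattern, assuming the records arrive as an iterable (in the real service this would presumably be a MongoDB cursor). The name `stream_json_array` and the sample records are hypothetical and not taken from victims-web. In Flask, the generator would be handed to `Response(stream_json_array(records), mimetype='application/json')`, so the array is sent piece by piece instead of being built in memory first.

```python
import json
from typing import Any, Iterable, Iterator


def stream_json_array(records: Iterable[Any]) -> Iterator[str]:
    """Yield a JSON array piece by piece instead of building one big string."""
    yield "["
    first = True
    for record in records:
        if not first:
            yield ","  # separator goes before every element except the first
        first = False
        yield json.dumps(record)
    yield "]"


if __name__ == "__main__":
    # Hypothetical sample records; the real ones would come from the database.
    sample = [{"cves": ["CVE-2009-1190"]}, {"cves": ["CVE-2010-1622"]}]
    print("".join(stream_json_array(sample)))
```

Joining the pieces always gives a valid JSON array, including `[]` for an empty result, so clients that read the whole body keep working unchanged.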

ashcrow commented 11 years ago

@abn if you are still getting the error please reopen this issue. AFAICT it's no longer happening. The logs seemed to indicate that the Mongo database cut connections earlier today, which may have been the cause. After restarting the container (and it reestablished connections) everything was back to normal.

dfj commented 11 years ago

The service seems to be quite intermittent. When it is working, it still doesn't give me complete JSON responses. I think the issue is the server side imposing some kind of constraint on process execution time, or max download size. For example with victims-enforcer I get:

+=============================+
|ENFORCE-VICTIMS-RULE SETTINGS|
+=============================+
   dbdriver : org.h2.Driver
fingerprint : fatal
    updates : auto
        url : https://victims-websec.rhcloud.com/service/v2
      dburl : jdbc:h2:.victims
   metadata : warning

[info] Victims database last entry was created on Thu Jan 01 10:00:00 EST 1970.
[info] Synchronizing CVE definitions with local database..
com.google.gson.stream.MalformedJsonException: Use JsonReader.setLenient(true) to accept malformed JSON at line 1 column 10
    at com.google.gson.stream.JsonReader.syntaxError(JsonReader.java:1310)
    at com.google.gson.stream.JsonReader.checkLenient(JsonReader.java:963)
    at com.google.gson.stream.JsonReader.readLiteral(JsonReader.java:1211)
    at com.google.gson.stream.JsonReader.nextValue(JsonReader.java:789)
    at com.google.gson.stream.JsonReader.peek(JsonReader.java:367)
    at com.google.gson.stream.JsonReader.expect(JsonReader.java:337)
    at com.google.gson.stream.JsonReader.beginArray(JsonReader.java:306)
    at com.redhat.victims.Synchronizer.sync(Synchronizer.java:135)
    at com.redhat.victims.Synchronizer.synchronizeDatabase(Synchronizer.java:170)
    at com.redhat.victims.VictimsRule.execute(VictimsRule.java:93)
    at org.apache.maven.plugins.enforcer.EnforceMojo.execute(EnforceMojo.java:190)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
    at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:322)
    at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:158)
    at org.apache.maven.cli.MavenCli.execute(MavenCli.java:537)
    at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:196)
    at org.apache.maven.cli.MavenCli.main(MavenCli.java:141)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:409)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:352)
[WARNING] Rule 0: com.redhat.victims.VictimsRule failed with message: Database synchronization failed.
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 3.187s
[INFO] Finished at: Mon May 13 17:19:44 EST 2013
[INFO] Final Memory: 13M/210M
[INFO] ------------------------------------------------------------------------

Trying to do the sync manually via wget:

$ wget https://victims-websec.rhcloud.com/service/v2/update/2013-01-01T00:00:00/
--2013-05-13 17:21:44-- https://victims-websec.rhcloud.com/service/v2/update/2013-01-01T00:00:00/
Resolving victims-websec.rhcloud.com (victims-websec.rhcloud.com)... 174.129.166.85
Connecting to victims-websec.rhcloud.com (victims-websec.rhcloud.com)|174.129.166.85|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 51820153 (49M) [text/html]
Saving to: ‘index.html’

2% [> ] 1,071,688 --.-K/s in 19s

2013-05-13 17:22:46 (56.5 KB/s) - Connection closed at byte 1071688. Retrying.

--2013-05-13 17:22:47-- (try: 2) https://victims-websec.rhcloud.com/service/v2/update/2013-01-01T00:00:00/
Connecting to victims-websec.rhcloud.com (victims-websec.rhcloud.com)|174.129.166.85|:443... connected.
HTTP request sent, awaiting response... 503 Service Temporarily Unavailable
2013-05-13 17:22:48 ERROR 503: Service Temporarily Unavailable.

So it looks like the server side is dropping the connection about 1 MB into the roughly 50 MB response :(
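A rough sketch of how a client could tell a cut-off download apart from genuinely bad JSON before handing the body to a parser. The function name `check_payload` and its return values are hypothetical; the only inputs assumed are the body bytes and the declared `Content-Length`, which is not sent at all when the response uses chunked transfer encoding.

```python
import json
from typing import Optional


def check_payload(body: bytes, declared_length: Optional[int]) -> str:
    """Classify a downloaded body as 'ok', 'truncated', or 'malformed'."""
    # Fewer bytes than the server promised means the connection was cut.
    if declared_length is not None and len(body) < declared_length:
        return "truncated"
    try:
        json.loads(body.decode("utf-8"))
    except ValueError:
        # Covers both invalid UTF-8 and invalid JSON.
        return "malformed"
    return "ok"


if __name__ == "__main__":
    # The wget run above received 1071688 of 51820153 declared bytes.
    partial = b"[" + b" " * 1071687
    print(check_payload(partial, 51820153))
```

With the numbers from the wget output this reports `truncated`, which is a more useful message than a `MalformedJsonException` from the parser.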

abn commented 11 years ago

@ashcrow now I am getting a 500 for https://victims-websec.rhcloud.com/service/v2/update/1900-01-01T00:10:20/

I think this should hopefully be fixed with #44

As @dfj's comment mentions, the service is dying off before serving the entire response.

ashcrow commented 11 years ago

I think you guys are right and there is a max set as to how much time/memory/something a single response can take.

ashcrow commented 11 years ago

I'm hoping streaming will fix this. If not we may have to look at some other options for responding with such a "large" data set.

abn commented 11 years ago

@ashcrow is streaming live?

ashcrow commented 11 years ago

@abn not yet, no. I opened up #43 to push out all the latest updates. I'm going to try to do it either this afternoon or tomorrow afternoon. Everything that has made it in by that time will be live.

abn commented 11 years ago

:+1:

ashcrow commented 11 years ago

@abn, latest code was pushed out. The service so far is working for me -- though it's not the fastest thing in the world. Let me know if it stays stable for you guys or if the connections get killed off.

abn commented 11 years ago
[abn@whippersnapper ~]$ curl "https://victims-websec.rhcloud.com/service/v2/update/2010-01-01T01:01:01" | tee vic.json
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   547  100   547    0     0      2      0  0:04:33  0:04:02  0:00:31   132
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>502 Proxy Error</title>
</head><body>
<h1>Proxy Error</h1>
<p>The proxy server received an invalid
response from an upstream server.<br />
The proxy server could not handle the request <em><a href="/service/v2/update/2010-01-01T01:01:01">GET&nbsp;/service/v2/update/2010-01-01T01:01:01</a></em>.<p>
Reason: <strong>Error reading from remote server</strong></p></p>
<hr>
<address>Apache/2.2.15 (Red Hat) Server at victims-websec.rhcloud.com Port 443</address>
</body></html>
[abn@whippersnapper ~]$ 

And I doubt if the streaming is working :(

dfj commented 11 years ago

Using victims-enforcer:

[info] Synchronizing CVE definitions with local database..
com.google.gson.stream.MalformedJsonException: Use JsonReader.setLenient(true) to accept malformed JSON at line 1 column 1

$ wget https://victims-websec.rhcloud.com/service/v2/update/2013-01-01T00:00:00/
--2013-05-14 09:11:11-- https://victims-websec.rhcloud.com/service/v2/update/2013-01-01T00:00:00/
Resolving victims-websec.rhcloud.com... 174.129.166.85
Connecting to victims-websec.rhcloud.com|174.129.166.85|:443... connected.
HTTP request sent, awaiting response... 502 Proxy Error
2013-05-14 09:15:12 ERROR 502: Proxy Error.

:(

ashcrow commented 11 years ago

@abn well, it was worth a shot.

Apologies @dfj, your error was due to me force restarting.

abn commented 11 years ago

@ashcrow seems to be working now since the restart.

abn commented 11 years ago

waiting for it to complete so i can verify the data

abn commented 11 years ago

Works for me (hopefully reproducible)

[abn@whippersnapper ~]$ cat vic.json | grep -o "cves" | wc -l
352
abn commented 11 years ago

Looks like it is very temperamental (wonder why though).

Tried https://victims-websec.rhcloud.com/service/v2/update/2014-01-01T01:01:01 (even though there should be no results, it 502s).

@ashcrow would this be something caused by OpenShift?

dfj commented 11 years ago

I think the solution is to move off OpenShift. I am happy to provide a VM on the Rackspace cloud to host it, billed to my own account. If we prove that this solves the problems, then I can request funding for the ongoing operation of the VM. @ashcrow what do you think of that idea? If you're OK with it, just let me know and I'll create the VM and give you credentials to log in.

ashcrow commented 11 years ago

Let's give a full, controlled VM a shot. I have room for another VM we can use as the test bed. I'm setting it up now and will try to deploy to it tomorrow midday EDT, for testing Wednesday morning your time.

ashcrow commented 11 years ago

I've set up a test instance on a VM. Today please try the same tests against this VM instance and see if we continue to get 5xx's or if things work.

abn commented 11 years ago

https://victims-websec.rhcloud.com/service/v2/update/2014-01-01T01:01:01

returns [] correctly. Testing bigger requests.

ashcrow commented 11 years ago

@abn note that the rhcloud is going to be the original instance. I've sent an email with test instance location. Booting it back up now in case you'd like to look at it.

abn commented 11 years ago

is the base uri the same??

ashcrow commented 11 years ago

Also, @abn, do you ever sleep? :-)

abn commented 11 years ago

sure will test now. @ashcrow can you hit me @ gmail please?

ashcrow commented 11 years ago

@abn will do.

Done.

abn commented 11 years ago

2 simultaneous requests for service/v2/update/2010-01-01T01:01:01/

HTTP/1.1 200 OK
Date: Tue, 14 May 2013 16:34:06 GMT
Server: Apache/2.2.23 (Fedora)
Set-Cookie: victims=; expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Path=/
Connection: close
Transfer-Encoding: chunked
Content-Type: application/json
abn commented 11 years ago

rhcloud works too but is unreliable; the dedicated VM seems to be more reliable.

service/v2/update/2014-01-01T01:01:01/

HTTP/1.1 200 OK
Date: Tue, 14 May 2013 16:37:38 GMT
Server: Apache/2.2.23 (Fedora)
Set-Cookie: victims="SNIPPED"; Path=/; HttpOnly
Connection: close
Transfer-Encoding: chunked
Content-Type: application/json

The interesting bit is that the cookie was set correctly on this request but not on the previous one.

abn commented 11 years ago

Service seems a tiny bit slow though, do we want to investigate that and fine tune in another issue? (not bad bad)

abn commented 11 years ago

Dedicated VM :+1:

OpenShift gives a 500 on the second request.

abn commented 11 years ago

@ashcrow also sleep? What be this thing you speak of?

abn commented 11 years ago

Raised #52 as a longer term issue to look at perf.

ashcrow commented 11 years ago

@abn yes, it is slow but I think that is to be expected. It is a 50 MB transfer, and the client cannot really load the JSON struct until all the data has been downloaded, because it is contained in a list (... I think).
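For what it's worth, a top-level list does not by itself force a client to wait for the whole body: the elements can be decoded one at a time as the data arrives. A minimal sketch of that idea, assuming the body arrives as text chunks; `iter_json_array` is a hypothetical helper (not part of any victims client), it uses only the standard library's `json.JSONDecoder.raw_decode`, and it is lenient about separators.

```python
import json
from typing import Any, Iterable, Iterator


def iter_json_array(chunks: Iterable[str]) -> Iterator[Any]:
    """Yield each element of a top-level JSON array as soon as it is complete."""
    decoder = json.JSONDecoder()
    buffer = ""
    started = False   # seen the opening "["
    finished = False  # seen the closing "]"
    for chunk in chunks:
        buffer += chunk
        while not finished:
            buffer = buffer.lstrip()
            if not buffer:
                break
            if not started:
                if buffer[0] != "[":
                    raise ValueError("expected a JSON array")
                started = True
                buffer = buffer[1:]
            elif buffer[0] == ",":
                buffer = buffer[1:]
            elif buffer[0] == "]":
                finished = True
            else:
                try:
                    value, end = decoder.raw_decode(buffer)
                except json.JSONDecodeError:
                    break  # element is incomplete; wait for more data
                if not buffer[end:].strip():
                    break  # value may continue in the next chunk (e.g. a number)
                yield value
                buffer = buffer[end:]
    if not finished:
        raise ValueError("array ended before the closing bracket (truncated response?)")


if __name__ == "__main__":
    pieces = ['[{"a"', ': 1}, {"b": 2', '}]']
    for item in iter_json_array(pieces):
        print(item)
```

A response that is cut off mid-array raises `ValueError` after the complete elements have been yielded, so a caller can decide whether to keep or discard the partial data.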

abn commented 11 years ago

standalone client works with vm host

[abn@chaotic target (master)]$ java -jar victims-client-1.0-SNAPSHOT-standalone.jar /home/abn/.m2/repository/org/springframework/spring/2.5.6/spring-2.5.6.jar
Synchronizing database with web service.
Sync complete.
Scanning: /home/abn/.m2/repository/org/springframework/spring/2.5.6/spring-2.5.6.jar
Spring Framework:null:2.5.6 matched [CVE-2009-1190, CVE-2010-1622, CVE-2011-2730]
Scanning Complete: /home/abn/.m2/repository/org/springframework/spring/2.5.6/spring-2.5.6.jar

@gcmurphy enforce still fails, but this seems to be a different issue

[info] Victims database last entry was created on Thu Jan 01 10:00:00 EST 1970.
[info] Synchronizing CVE definitions with local database..
org.h2.jdbc.JdbcSQLException: NULL not allowed for column "VERSION"; SQL statement:
INSERT INTO victims(cves, vendor, name, created, version, submitter, format, status, file_hash) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?) [23502-171]
abn commented 11 years ago

@ashcrow might be yes, but that is peculiar for a request that returns "[]"

ashcrow commented 11 years ago

@abn oh, yes, very much. We can look at less controller based transforms and better caching as well.

ashcrow commented 11 years ago

@abn any objection to closing this since we now know what the fix is and have open tickets for them?

abn commented 11 years ago

I have raised #52 for a general perf review. Would be interesting to see how much we can squeeze out of it.

abn commented 11 years ago

@ashcrow no objections closing now.

The fix for this issue will be migrating the service to a dedicated VM. #51

@dfj fix for enforcer will be at https://github.com/victims/victims-enforcer/issues/3