EIDA / mediatorws

EIDA NG Mediator/Federator web services
GNU General Public License v3.0

[INGV]: HTTP status code 204 #90

Closed damb closed 4 years ago

damb commented 4 years ago

Follow up from the eida_maint mailing list.

Quoting @petrrr:

Status 204 through federator

We would need to understand why some requests through the federator systematically(?) return status 204, while there seems to be no problem when doing the same directly. This is even evident from tests done at ETH, so it should not be related to request rates. Only the fdsnws-station service seems to be affected, while fdsnws-dataselect does not show this.

We also see this behaviour for some of the GUI/Quality services provided by KNMI. Looks like these rely on the federator.

damb commented 4 years ago

@petrrr,

in order to debug the issue I need more detailed information i.e.:

Best if you post here an exemplary request.

Thanks.

petrrr commented 4 years ago

Only text seems to be affected by this issue:

query?level=channel&format=text&ne [ <=> ] 41.73K --.-KB/s in 0.05s

2020-07-21 12:47:55 (823 KB/s) - ‘query?level=channel&format=text&network=GU.1’ saved [42736]


* http://eida-federator.ethz.ch/fdsnws/station/1/query?level=channel&format=text&network=GU

[budvar:~] petr% wget "http://eida-federator.ethz.ch/fdsnws/station/1/query?level=channel&format=text&network=GU"
--2020-07-21 12:46:33--  http://eida-federator.ethz.ch/fdsnws/station/1/query?level=channel&format=text&network=GU
Resolving eida-federator.ethz.ch (eida-federator.ethz.ch)... 129.132.144.214
Connecting to eida-federator.ethz.ch (eida-federator.ethz.ch)|129.132.144.214|:80... connected.
HTTP request sent, awaiting response... 204 No Content
2020-07-21 12:46:34 (0.00 B/s) - ‘query?level=channel&format=text&network=GU’ saved [0]


* http://webservices.ingv.it/fdsnws/station/1/query?level=channel&format=xml&network=GU

[budvar:~] petr% wget "http://webservices.ingv.it/fdsnws/station/1/query?level=channel&format=xml&network=GU"
--2020-07-21 12:49:13--  http://webservices.ingv.it/fdsnws/station/1/query?level=channel&format=xml&network=GU
Resolving webservices.ingv.it (webservices.ingv.it)... 93.63.207.206
Connecting to webservices.ingv.it (webservices.ingv.it)|93.63.207.206|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/xml]
Saving to: ‘query?level=channel&format=xml&network=GU.1’

query?level=channel&format=xml&net [ <=> ] 238.38K --.-KB/s in 0.1s

2020-07-21 12:49:13 (1.57 MB/s) - ‘query?level=channel&format=xml&network=GU.1’ saved [244096]


* http://eida-federator.ethz.ch/fdsnws/station/1/query?level=channel&format=xml&network=GU

[budvar:~] petr% wget -O GU.xml "http://eida-federator.ethz.ch/fdsnws/station/1/query?level=channel&format=xml&network=GU"
--2020-07-21 12:51:08--  http://eida-federator.ethz.ch/fdsnws/station/1/query?level=channel&format=xml&network=GU
Resolving eida-federator.ethz.ch (eida-federator.ethz.ch)... 129.132.144.214
Connecting to eida-federator.ethz.ch (eida-federator.ethz.ch)|129.132.144.214|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 243814 (238K) [application/xml]
Saving to: ‘GU.xml’

GU.xml 100%[================================================================>] 238.10K --.-KB/s in 0.06s

2020-07-21 12:51:08 (3.62 MB/s) - ‘GU.xml’ saved [243814/243814]
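
The comparison above (federator vs. endpoint, text vs. xml) can be repeated with a small script. The following is only a minimal sketch: the helper names `build_url`, `http_status` and `compare` are made up for illustration, and it assumes that the status code and body size are all that needs to be compared.

```python
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen

# Service URLs taken from the wget calls in this thread.
FEDERATOR = "http://eida-federator.ethz.ch/fdsnws/station/1/query"
ENDPOINT = "http://webservices.ingv.it/fdsnws/station/1/query"


def build_url(base, network, fmt, level="channel"):
    """Build an fdsnws-station query URL like the ones used above."""
    query = urlencode({"level": level, "format": fmt, "network": network})
    return f"{base}?{query}"


def http_status(url, timeout=60):
    """Return (HTTP status code, body size in bytes) for a GET request."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status, len(resp.read())
    except HTTPError as err:
        # 4xx/5xx responses are raised as exceptions by urllib.
        return err.code, 0


def compare(network, fetch=http_status):
    """Query both services in both formats; `fetch` is injectable for testing."""
    results = {}
    for name, base in (("federator", FEDERATOR), ("endpoint", ENDPOINT)):
        for fmt in ("text", "xml"):
            results[(name, fmt)] = fetch(build_url(base, network, fmt))
    return results

# Example (performs real HTTP requests):
# for key, value in compare("GU").items():
#     print(key, value)
```

A 204 for `("federator", "text")` next to a 200 for `("endpoint", "text")` would reproduce the behaviour reported here.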

damb commented 4 years ago

@petrrr, the current deployment is quite different from the one in place when the issue was opened. However, eidaws-federator implements retry budget facilities in order to protect DCs from overload. I.e. if an endpoint FDSNWS resource doesn't respond, seems to be overloaded (HTTP status code 503 is returned) or returns HTTP status code 500 (Internal Server Error), then, once a threshold is exceeded, requests are no longer forwarded to the endpoint DC for a configurable duration. These statistics are kept at datacenter-resource granularity. Besides, the number of concurrent requests to a single DC is limited (based on HTTP connection pooling). This parameter is also configurable at datacenter granularity.

Note that eidaws-federator does not cope with an FDSNWS deployment where the requests per second (RPS) are limited by DCs. Instead, DCs should whitelist the federator and communicate how many concurrent HTTP connections are accepted. Limiting the RPS is absolutely reasonable for usual clients. However, throttling a gateway service such as eidaws-federator by such means doesn't work out in a distributed service environment.

Additionally, it should be mentioned that eidaws-federator caches fdsnws-station requests on multiple levels. This approach has the advantage that a) responses are returned faster and hence the user experience is better and b) endpoint DCs aren't overloaded.

To conclude: If your initial request to http://eida-federator.ethz.ch/fdsnws/station/1/query?level=channel&format=xml&network=GU produced lots of HTTP 500/503 status codes due to the reasons mentioned above, then a second request issued right afterwards to http://eida-federator.ethz.ch/fdsnws/station/1/query?level=channel&format=text&network=GU might temporarily lead to HTTP status code 204.
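
To illustrate the mechanism, here is a minimal Python sketch of a retry budget and of how skipping a blocked endpoint can end in a 204. It is not the eidaws-federator implementation: the class `RetryBudget`, the function `federate`, the sliding window of recent responses and all default values are assumptions made for this example.

```python
import time
from collections import deque


class RetryBudget:
    """Track recent response outcomes for one (datacenter, resource) pair."""

    def __init__(self, threshold=0.5, window=10, cooldown=60.0,
                 clock=time.monotonic):
        self.threshold = threshold  # error ratio above which requests are blocked
        self.window = window        # number of most recent responses considered
        self.cooldown = cooldown    # seconds during which nothing is forwarded
        self.clock = clock
        self._outcomes = deque(maxlen=window)
        self._blocked_until = 0.0

    def record(self, status):
        """Record a response; status None stands for 'no response'."""
        failed = status is None or status in (500, 503)
        self._outcomes.append(failed)
        ratio = sum(self._outcomes) / len(self._outcomes)
        if len(self._outcomes) == self.window and ratio > self.threshold:
            self._blocked_until = self.clock() + self.cooldown
            self._outcomes.clear()

    def allows_request(self):
        return self.clock() >= self._blocked_until


def federate(routes, budgets, fetch):
    """Return (status, body); 204 if no endpoint contributed any data."""
    chunks = []
    for url, key in routes:
        budget = budgets.setdefault(key, RetryBudget())
        if not budget.allows_request():
            continue  # blocked endpoint is skipped without an error
        status, body = fetch(url)
        budget.record(status)
        if status == 200:
            chunks.append(body)
    return (200, "".join(chunks)) if chunks else (204, "")
```

In this sketch, a request whose only endpoint is currently blocked returns 204 even though the endpoint itself would answer with 200, and it returns 200 again once the cooldown has elapsed.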

petrrr commented 4 years ago

@damb: I am not really sure I understand your reply here. However, I was quite surprised to see status 204 for the text format and 200 for the xml format. In the end you are sending this request to the same endpoint, and the text request should also be much less resource demanding.

In any case, I guess returning status code 204 might be rather misleading if there are real problems which are silently ignored.

We of course have no visibility of any problems you might be observing. However, as far as I am informed, the federator was white-listed quite some time ago. Moreover, the outgoing bandwidth has also been significantly increased. In case you observe any issues with the INGV service endpoint, you should probably also let the maintainers of the services know. But I would suggest taking these details offline. They are off-topic.

In any case, retrying today I got a reply with status code 200. So technically I guess this issue could be closed. I am just wondering why the performance of the text reply is so low. It does not seem to reflect the endpoint's performance:

 wget -O test.xml "http://webservices.ingv.it/fdsnws/station/1/query?level=channel&format=text&network=IV" ; ;
--2020-07-27 19:54:29--  http://webservices.ingv.it/fdsnws/station/1/query?level=channel&format=text&network=IV
Resolving webservices.ingv.it (webservices.ingv.it)... 93.63.207.206
Connecting to webservices.ingv.it (webservices.ingv.it)|93.63.207.206|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: ‘test.xml’

test.xml                            [   <=>                                                  ] 742.86K  1.30MB/s    in 0.6s    

2020-07-27 19:54:30 (1.30 MB/s) - ‘test.xml’ saved [760687]
petrrr commented 4 years ago

This is what I get when querying the federator. The reply took more than 10 min, probably timed out and is incomplete. But this might be unrelated; just let me know if I am supposed to do anything about this or submit a different issue!

Mon 27 Jul 2020 19:51:02 CEST
--2020-07-27 19:51:02--  http://eida-federator.ethz.ch/fdsnws/station/1/query?level=channel&format=text&network=IV
Resolving eida-federator.ethz.ch (eida-federator.ethz.ch)... 129.132.144.214
Connecting to eida-federator.ethz.ch (eida-federator.ethz.ch)|129.132.144.214|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: ‘query?level=channel&format=text&network=IV’

query?level=chann                 [   <=>                                            ]  42.35K  70.1KB/s
query?l                           [                              <=>                 ] 100.83K   551 B/s
query?level=channel&format=te     [                             <=>                  ] 244.19K   142 B/s    in 9m 59s

2020-07-27 20:01:07 (417 B/s) - ‘query?level=channel&format=text&network=IV’ saved [250054]
damb commented 4 years ago

@petrrr,

so now we're discussing two totally different issues which are:

petrrr commented 4 years ago

@damb: I guess we are now hijacking this issue. So I'd suggest we close this one. The specific problem described initially is gone. We can open more specific ones if this is helpful.