streetcomplete / StreetComplete

Easy to use OpenStreetMap editor for Android
https://streetcomplete.app
GNU General Public License v3.0
3.9k stars 357 forks source link

integration tests are prone to failing with java.net.SocketTimeoutException #1010

Closed matkoniecz closed 5 years ago

matkoniecz commented 6 years ago

It applies to both add_housenumber and construction group of integration tests.

This tests sometimes fail with de.westnordost.osmapi.common.errors.OsmConnectionException: java.net.SocketTimeoutException: failed to connect to overpass-api.de/178.63.11.215 (port 443) after 45000ms: isConnected failed: ETIMEDOUT (Connection timed out)

It is happening despite overpass turbo operating without any issues.

It was happening also before I started making any changes to integration tests and from what I experienced starting to use attic data failed to change frequency of this issue.

Failures are intermittent, typically disappearing once I decide to debug this issue :)

I think that implementing https://github.com/westnordost/osmapi/issues/16 would be a good first step toward figuring what is causing this problem - can you confirm that it would be a viable PR?

westnordost commented 6 years ago

Well I also noticed some hickups on overpass-api.de today.

I'll just assign you here, because I never experienced these. And even if there are some, there is nothing we can do about it on our side about a socket timeout exception so not sure what your plan would be here.

matkoniecz commented 6 years ago

so not sure what your plan would be here.

Hopefully it is a bug. I encountered similar issue before in a different project and it was caused by network library terminating connection too early (timeout for connection was lower that timeout in an overpass query).

It may be also preferable to retry connection after pause rather than report failure (brittle tests currently are not too horrible, but it is a bit irritating and random failures would make #1003 utterly pointless).

matkoniecz commented 5 years ago

For now I tested it on a different Internet connection and on a different phone to be sure that problem is not caused by my WIFI.

The same problem continued. Obviously, it may be problem caused by my laptop or by poorly made smartphone (both phones are the same low-quality Xiaomi).

westnordost commented 5 years ago

We need something like Volkswagen for Java ;-)

rugk commented 5 years ago

But as the issue also happens on Travis (see https://github.com/westnordost/StreetComplete/pull/1113, which I guess is very likely the same issue), there is very likely something wrong here. :smile:

matkoniecz commented 5 years ago

@rugk thanks, good point. I guess I can skip further checks whatever this problem is caused by my hardware/configuration.

rugk commented 5 years ago

Yeah Travis runs these on Amazon AWS somewhere, AFAIK. Maybe it's also just something with the remote servers it calls (i.e. Overpass?). Maybe it hits some issues there…

HolgerJeromin commented 5 years ago

Can we mockup the answers for the tests?

matkoniecz commented 5 years ago

Part of the point of tests is to detect that Overpass Turbo changed format of queries or outputs.

And it is not theoretical, one of quests (construction one?) used syntax that was not legal, but was accepted by Overpass Turbo. Update changed it and queries started to fail.

Mocking would change it from integration test to an unit test what is undesirable in this case.

rugk commented 5 years ago

Mocking would change it from integration test to an unit test what is undesirable in this case.

Well… theoretically (ideally :wink:) you have both, of course. :smile:

mmd-osm commented 5 years ago

one of quests (construction one?) used syntax that was not legal, but was accepted by Overpass Turbo

You probably mean Overpass API. Overpass Turbo is a web frontend, and has no built in syntax checking for Overpass QL.

matkoniecz commented 5 years ago

Yes, sorry. I keep confusing Overpass API with Overpass Turbo.

mmd-osm commented 5 years ago

Back on topic....

Update changed it and queries started to fail.

There was a side effect due to with stricter semicolon checks causing some issues during 0.7.54 -> 0.7.55 update. Overall, those issues shouldn't happen that often.

Unfortunately, there's no good way to run a syntax check on a query without querying the database. Maybe you could create an enhancement request, if this is helpful.

westnordost commented 5 years ago

This is solved now (by not running the integration tests on travis)