Open mexicapita opened 4 years ago
I've experienced this as well (on DS 2.7.0 / Skipper 2.5.1). Similar setup to @mexicapita (docker-compose). It's most evident when destroying and creating streams in quick succession. I've tried a check-and-wait based on
return scdf.streamOperations().list().getContent().stream().anyMatch(sd -> sd.getName().equals(name));
but even when the check above reports that the stream is gone, the exception still happens.
If I add a time-based wait of 2+ seconds (just as a test, no way I'm leaving that in), no exception occurs.
Environment: Linux 5.6.13-100.fc30.x86_64, Docker version 19.03.12 (build 48a66213fe), docker-compose version 1.22.0 (build f46880f)
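Since the list-based check reportedly returns before the server is really ready, one alternative to a fixed sleep is to retry the failing call itself a bounded number of times. This is only a minimal sketch: `Retry` and `withRetry` are hypothetical names, not part of the Data Flow client, and the action in `main` is a stand-in for the real create/deploy call. It assumes the failure surfaces as an exception thrown by that call.

```java
import java.util.concurrent.Callable;

public class Retry {

    /**
     * Runs the action up to maxAttempts times, sleeping delayMillis between
     * failed attempts. Returns the first successful result, or rethrows the
     * last failure once the attempts are used up.
     */
    public static <T> T withRetry(Callable<T> action, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (InterruptedException e) {
                // Do not swallow interruption; let the caller handle it.
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in action: fails twice, then succeeds. In the setup described
        // above this would wrap the stream create/deploy call instead.
        int[] calls = {0};
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("not ready yet");
            }
            return "deployed";
        }, 5, 100);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

This bounds the total wait and stops as soon as the call succeeds, but it does not address the underlying port exhaustion; if no ports are ever released, every attempt will fail.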
This is most likely because Skipper makes a port range available for streams to connect to it, and all of those ports are in use. If you are running with docker-compose you can, as a workaround, stop and start the Skipper container. Nevertheless, a definitive solution should be provided that releases unused ports and prevents this DataFlowClientException.
Perhaps this feature could be provided through the skipper shell.
We are using SCDF with Skipper 2.5.2 and are also experiencing this issue regularly. We already increased the number of ports for the deployment to 200 via docker-compose, but that did not solve the issue. We redeploy all streams using SCDF shell scripts to roll out new versions and often run into this error. After such a failure it always takes a lot of trial and error to get the streams to deploy.
Is there a timeline, or at least a workaround for the SCDF shell, to prevent this kind of issue?
Thank you very much for your support.
Kind regards, Kristian
I experience this all the time. The exact message I'm seeing is:
Caused by: java.lang.IllegalStateException: Could not find a free random port range { low=20000, high=20100}
The only way to start deploying apps again is to restart skipper. Sometimes after a restart I have to destroy the streams, since Skipper reports that the apps are not in the expected state. I've also tried increasing the port range, but it still happens.
Versions: DATAFLOW_VERSION=2.7.2 SKIPPER_VERSION=2.6.2
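To tell whether the ports in the reported range are really occupied, or whether Skipper is only tracking them as taken, it can help to probe the range directly. The following is a diagnostic sketch, not Skipper's own allocation logic: `PortRangeProbe` and `freePorts` are hypothetical names, and the range 20000-20100 is taken from the error message above. It would need to run in the same network namespace as Skipper (for example, inside the Skipper container) to be meaningful.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.ArrayList;
import java.util.List;

public class PortRangeProbe {

    /** Returns the ports in [low, high] that can currently be bound on this host. */
    public static List<Integer> freePorts(int low, int high) {
        List<Integer> free = new ArrayList<>();
        for (int port = low; port <= high; port++) {
            // Binding succeeds only if nothing else is listening on the port;
            // the socket is closed again right away by try-with-resources.
            try (ServerSocket socket = new ServerSocket(port)) {
                free.add(port);
            } catch (IOException e) {
                // Port is in use or cannot be bound; treat it as not free.
            }
        }
        return free;
    }

    public static void main(String[] args) {
        int low = 20000;
        int high = 20100;
        List<Integer> free = freePorts(low, high);
        System.out.println(free.size() + " of " + (high - low + 1)
                + " ports free in [" + low + ", " + high + "]");
    }
}
```

If most ports turn out to be bindable while the error persists, that would suggest the problem is in how the range is tracked rather than in actual port usage; if they are bound, leftover app processes from earlier deployments would be the first thing to look for.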
I have detected a problem in the skipper server. When it has been running, deploying and stopping several streams for a while, it stops working and returns an error.
` Exception in thread "main" org.springframework.cloud.dataflow.rest.client.DataFlowClientException: Could not install AppDeployRequest [[AppDeploymentRequest@2f6d21ff commandlineArguments = list[[empty]], deploymentProperties = map['spring.cloud.deployer.group' -> 'test'], definition = [AppDefinition@2b95b70b name = 'log-v29', properties = map['spring.cloud.dataflow.stream.app.label' -> 'log', 'spring.cloud.stream.kafka.streams.binder.zkNodes' -> 'zookeeper:2181', 'spring.cloud.stream.metrics.properties' -> 'spring.application.name,spring.application.index,spring.cloud.application.,spring.cloud.dataflow.', 'spring.cloud.dataflow.stream.name' -> 'test', 'spring.cloud.stream.kafka.streams.binder.brokers' -> 'PLAINTEXT://kafka-broker:9092', 'spring.metrics.export.triggers.application.includes' -> 'integration**', 'spring.cloud.stream.metrics.key' -> 'test.log.${spring.cloud.application.guid}', 'spring.cloud.stream.bindings.input.group' -> 'test', 'spring.cloud.stream.kafka.binder.zkNodes' -> 'zookeeper:2181', 'spring.cloud.dataflow.stream.app.type' -> 'sink', 'spring.cloud.stream.bindings.input.destination' -> 'test.time', 'spring.cloud.stream.kafka.binder.brokers' -> 'PLAINTEXT://kafka-broker:9092']], resource = org.springframework.cloud.stream.app:log-sink-kafka:jar:2.1.2.RELEASE]] to platform [default]. Error Message = [Could not find a free random port range { low=20000, high=20100}]
`
If I try to continue deploying, three things can happen (I think randomly):
` Exception in thread "main" org.springframework.cloud.dataflow.rest.client.DataFlowClientException: Could not install AppDeployRequest [[AppDeploymentRequest@2f9e4e70 commandlineArguments = list[[empty]], deploymentProperties = map['spring.cloud.deployer.group' -> 'test'], definition = [AppDefinition@278b9959 name = 'log-v30', properties = map['spring.cloud.dataflow.stream.app.label' -> 'log', 'spring.cloud.stream.kafka.streams.binder.zkNodes' -> 'zookeeper:2181', 'spring.cloud.stream.metrics.properties' -> 'spring.application.name,spring.application.index,spring.cloud.application.,spring.cloud.dataflow.', 'spring.cloud.dataflow.stream.name' -> 'test', 'spring.cloud.stream.kafka.streams.binder.brokers' -> 'PLAINTEXT://kafka-broker:9092', 'spring.metrics.export.triggers.application.includes' -> 'integration**', 'spring.cloud.stream.metrics.key' -> 'test.log.${spring.cloud.application.guid}', 'spring.cloud.stream.bindings.input.group' -> 'test', 'spring.cloud.stream.kafka.binder.zkNodes' -> 'zookeeper:2181', 'spring.cloud.dataflow.stream.app.type' -> 'sink', 'spring.cloud.stream.bindings.input.destination' -> 'test.time', 'spring.cloud.stream.kafka.binder.brokers' -> 'PLAINTEXT://kafka-broker:9092']], resource = org.springframework.cloud.stream.app:log-sink-kafka:jar:2.1.2.RELEASE]] to platform [default]. 
Error Message = [App with deploymentId [test.log-v30] with state [deployed] doesn't match expected state [unknown]]
    at org.springframework.cloud.dataflow.rest.client.VndErrorResponseErrorHandler.handleError(VndErrorResponseErrorHandler.java:65)
    at org.springframework.web.client.ResponseErrorHandler.handleError(ResponseErrorHandler.java:63)
    at org.springframework.web.client.RestTemplate.handleResponse(RestTemplate.java:778)
    at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:736)
    at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:670)
    at org.springframework.web.client.RestTemplate.postForObject(RestTemplate.java:414)
    at org.springframework.cloud.dataflow.rest.client.StreamTemplate.deploy(StreamTemplate.java:127)
    at org.springframework.cloud.dataflow.rest.client.dsl.StreamDefinition.deploy(StreamDefinition.java:78)
    at org.springframework.cloud.dataflow.rest.client.dsl.StreamDefinition.deploy(StreamDefinition.java:87)
    at com.grupotsk.springflow.prometheuswrapperdsl.PrometheusWrapperDslApplication.main(PrometheusWrapperDslApplication.java:65)
`
I already asked about this on Gitter, without success. I'm running this docker-compose.