Closed sabbyanandan closed 5 years ago
I believe this is a symptom of #58. If @cppwfs agrees, we can close this one as a duplicate.
This issue is on the CTR v2.0 line (which builds on SCT v2.0), though, whereas #58 relates to SCT v2.1.
These are separate issues.
Hello @Rostish, I'm still having trouble reproducing the problem. I ran the following graph (each task was a timestamp task): 1) CTR 2.0.2 on SCDF 1.7.2, 80 times; 2) CTR 2.0.2 on SCDF 2.0, 127 times (the DB used was MySQL).
This graph was constructed after reviewing your log and deriving the basic flow of what you were trying to run.
The command I executed looked like this:
java -jar composedtaskrunner-task-2.0.2.RELEASE.jar --spring.cloud.task.closecontextEnabled=true --increment-instance-enabled=true --split-thread-core-pool-size=4 --interval-time-between-checks=1000 --graph="logrunme-1&&logrunme-2&&<logrunme-3||logrunme-4||logrunme-5||logrunme-6>&&<logrunme-7||logrunme-8||logrunme-9>&&logrunme-10&&<logrunme-11||logrunme-12>&&<logrunme-13||logrunme-14>&&logrunme-15&&<logrunme-16||logrunme-17||logrunme-18>&&logrunme-19&&<logrunme-20||logrunme-21||logrunme-22>"
Can you see a difference between my test case above and what you are executing?
@cppwfs Good day to you!
I pass the following arguments via the REST client launch command: --dataflow-server-uri: http://10.101.48.150:9494 (could this be connected with the problem?), --split-thread-core-pool-size: 5 (as I see, you use 4), --increment-instance-enabled: true (the same).
I also pass the following arguments via the DSL (in your example you didn't use any arguments in the DSL): --runner.localDate=2018-12-08 --spring.cloud.consul.config.datakey=calculate-vm-click-statistic --runner.mode=EXEC
A small example:
calculate-vm-click-statistic: multirating-baseoperation --runner.localDate=2018-12-08 --spring.cloud.consul.config.datakey=calculate-vm-click-statistic --runner.mode=EXEC && <average-genre-statistic-calculation-online-vm: multirating-baseoperation --runner.localDate=2018-12-08 --spring.cloud.consul.config.datakey=average-genre-statistic-calculation-online-vm --runner.mode=EXEC || average-click-statistic-calculation-online-web: multirating-baseoperation --runner.localDate=2018-12-08 --spring.cloud.consul.config.datakey=average-click-statistic-calculation-online-web --runner.mode=EXEC || average-click-statistic-calculation-online-vm: multirating-baseoperation --runner.localDate=2018-12-08 --spring.cloud.consul.config.datakey=average-click-statistic-calculation-online-vm --runner.mode=EXEC || average-genre-statistic-calculation-online-web: multirating-baseoperation --runner.localDate=2018-12-08 --spring.cloud.consul.config.datakey=average-genre-statistic-calculation-online-web --runner.mode=EXEC> && average-genre-statistic-calculation-off: multirating-baseoperation --runner.localDate=2018-12-08 --spring.cloud.consul.config.datakey=average-genre-statistic-calculation-off --runner.mode=EXEC && fusion-v2: multirating-baseoperation --runner.localDate=2018-12-08 --spring.cloud.consul.config.datakey=fusion-v2 --runner.mode=EXEC && aggregation-transformation: multirating-baseoperation --runner.localDate=2018-12-08 --spring.cloud.consul.config.datakey=aggregation-transformation --runner.mode=EXEC && export-infosys: multirating-baseoperation --runner.localDate=2018-12-08 --spring.cloud.consul.config.datakey=export-infosys --runner.mode=EXEC && combine-infosys: multirating-baseoperation --runner.localDate=2018-12-08 --spring.cloud.consul.config.datakey=combine-infosys --runner.mode=EXEC
The main difference is in the executed tasks: I use my custom task for all executions. It has the following bootstrap.yaml (I use Consul):
runner:
  localDate: **pass this argument via dsl**
  mode: **pass this argument via dsl**
spring:
  application:
    name: multi-rating-operations
  cloud:
    consul:
      config:
        watch:
          enabled: false
        enabled: true
        prefix: ""
        datakey: **pass this argument via dsl**
        format: yaml
      host: 10.101.48.150
      port: 8500
      discovery:
        prefer-ip-address: true
        enabled: false
  jpa:
    properties:
      hibernate:
        jdbc:
          lob:
            non_contextual_creation: true
  datasource:
    url: jdbc:postgresql://192.168.21.70:5432/data_flow
    username: xxxxxxxx
    password: xxxxxxxx
    driver-class-name: org.postgresql.Driver
logging:
  level:
    org:
      springframework:
        cloud:
          task: debug
dataBusRest:
  dataSourceUrl: 10.101.48.150
  user: xxxxxxxx
  password: xxxxxxxx
  port: 10888
I could try to debug CTR myself. Could you share your methodology with me? Or do I just need to download the CTR sources and try to start it, as you did, using the java -jar command?
Are you including --spring.cloud.task.closecontextEnabled=true in your parameters? That is required.
I will try after the holidays in my country. I can't do it right now, because my code is available only from my workplace.
I was able to reproduce it somewhat.
I used the same graph and tooling, except that in this case I used SCDF-Local to launch docker images, as you discussed previously.
What occurred was that, after running the CTR instance 50 times, one of the CTR executions appeared to stop. In this case one of the child apps failed to start because of the following error: docker: Error response from daemon: driver failed programming external connectivity on endpoint stupefied_hypatia (fc8f22b557ad6dd9ea4c692792dab9e9259c0ae872cf02d1397409c99171f4d0): Error starting userland proxy: listen tcp 0.0.0.0:58386: bind: address already in use.
This error appeared in the stderr log of the child task.
So CTR was waiting for the child application to start, which it never did, and thus CTR was effectively blocked.
The solution to this is to set the max-wait-time as discussed here: https://github.com/spring-cloud-task-app-starters/composed-task-runner/blob/master/spring-cloud-starter-task-composedtaskrunner/README.adoc
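To illustrate why a maximum wait time unblocks the runner, here is a minimal sketch in plain Java. It is not CTR's actual implementation; the class MaxWaitSketch, the method waitForChild, and its parameters are hypothetical names chosen for this example. It only shows the general pattern: poll a child at a fixed interval, but give up once a deadline passes instead of waiting forever.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; not taken from the CTR source code.
public class MaxWaitSketch {

    /**
     * Polls until the child reports completion or maxWait elapses.
     * Returns true if the child completed, false if the wait timed out
     * or the waiting thread was interrupted.
     */
    static boolean waitForChild(BooleanSupplier childFinished,
                                Duration checkInterval,
                                Duration maxWait) {
        long deadline = System.nanoTime() + maxWait.toNanos();
        while (!childFinished.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up instead of blocking forever
            }
            try {
                Thread.sleep(checkInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A child that never starts, similar to the failed docker container above.
        boolean finished = waitForChild(() -> false,
                Duration.ofMillis(10), Duration.ofMillis(100));
        System.out.println("child finished: " + finished);
    }
}
```

Without the deadline check, the loop above would spin indefinitely for a child that never starts, which matches the blocked behavior described here.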
I had to go to work to check this))). It seems the --spring.cloud.task.closecontextEnabled=true parameter helped me. I did about 60 launches and CTR never got stuck. Could you explain the meaning of this parameter?
About docker: it looks like another bug, because I use docker only for the SCDF-local deployment, and then use the volume command to move custom tasks to a container folder.
I'm glad that this resolved the issue for you. A brief discussion of the parameter can be found here: https://docs.spring.io/spring-cloud-task/docs/current-SNAPSHOT/reference/htmlsingle/#features-lifecycle CTR uses ThreadPoolTaskExecutor to manage splits in the graph, and thus the context remains open beyond the scope of the task. This setting closes the context upon completion in CTR. As of the release of CTR 2.1, closeContextEnabled will be set by default.
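The effect of a thread pool outliving its work can be sketched with plain java.util.concurrent. This is an analogy under stated assumptions, not CTR or Spring code: the class LingeringPoolSketch and the method poolTerminates are hypothetical, and a plain ExecutorService stands in for Spring's ThreadPoolTaskExecutor. The sketch shows that a pool does not terminate just because its tasks have finished; something has to shut it down explicitly, which is roughly the role that closing the context plays here.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; a plain executor stands in for the Spring one.
public class LingeringPoolSketch {

    /**
     * Runs one task to completion on a fresh pool, optionally requests shutdown,
     * and reports whether the pool terminated within a short grace period.
     */
    static boolean poolTerminates(boolean shutdownWhenDone) {
        ExecutorService pool = Executors.newFixedThreadPool(2); // non-daemon worker threads
        try {
            pool.submit(() -> { }).get(); // the work itself has finished after this call
            if (shutdownWhenDone) {
                pool.shutdown(); // analogous to closing the context when the task completes
            }
            return pool.awaitTermination(200, TimeUnit.MILLISECONDS);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow(); // clean up so this demo can always exit
        }
    }

    public static void main(String[] args) {
        System.out.println("without shutdown: terminated=" + poolTerminates(false));
        System.out.println("with shutdown: terminated=" + poolTerminates(true));
    }
}
```

In the first call the idle worker threads are still alive after the work is done, so a JVM that relied on them exiting would keep running, which is consistent with the lingering CTR process reported in this issue.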
The other issue is not really a bug with SCDF or CTR.
I will go ahead and close this issue.
As a user, while orchestrating a deeply nested graph on CTR v2.0.2, I'm noticing that the CTR process continues to run even after successfully executing all the steps with exit-code=0. This behavior is not observed with the v2.0.0 release.
See https://github.com/spring-cloud/spring-cloud-dataflow/issues/2667 for more details.