apache / hop

Hop Orchestration Platform
https://hop.apache.org/
Apache License 2.0

[Task]: Document how Pipelines using Merge Join can run into deadlock #4514

Closed Adalennis closed 2 weeks ago

Adalennis commented 2 weeks ago

What needs to happen?

See https://github.com/apache/hop/issues/3740

Issue Priority

Priority: 3

Issue Component

Component: Documentation

Adalennis commented 2 weeks ago

.take-issue

zf18634362654 commented 2 weeks ago

I also found that if there are two upstream transforms, they both use the getFromRowset method. If two identical getFromRowset transforms appear, the pipeline hangs once the data reaches 20000 rows.

hansva commented 2 weeks ago

This has been covered by #3740. It applies to almost all transforms: we work in micro-batches with buffered rows. If the buffers fill up and there is no way to push rows further down the stream, the pipeline will deadlock. Documentation has been added: https://hop.apache.org/manual/next/how-to-guides/avoiding-deadlocks-when-using-stream-lookup.html
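
To make the buffer mechanism above concrete, here is a minimal single-threaded simulation. It is a sketch under assumptions, not Hop code: `run_pipeline`, `buffer_size` and the two branch names are hypothetical, and the model assumes a source that copies every row to two bounded buffers, plus a downstream transform that must read one branch completely before it touches the other.

```python
from collections import deque


def run_pipeline(total_rows, buffer_size):
    """Simulate source -> (lookup branch, main branch) -> consumer.

    Hypothetical model, not Hop's API:
    - the source copies each row to both branches and can only advance
      when BOTH bounded buffers have room;
    - the consumer reads the whole lookup branch before the main branch.
    Returns "finished" or "deadlock".
    """
    lookup_buf, main_buf = deque(), deque()
    sent = lookup_read = main_read = 0

    while True:
        progressed = False

        # Source: blocked as soon as either output buffer is full.
        if (sent < total_rows
                and len(lookup_buf) < buffer_size
                and len(main_buf) < buffer_size):
            lookup_buf.append(sent)
            main_buf.append(sent)
            sent += 1
            progressed = True

        # Consumer: insists on draining the lookup branch first.
        if lookup_read < total_rows:
            if lookup_buf:
                lookup_buf.popleft()
                lookup_read += 1
                progressed = True
        elif main_buf:
            main_buf.popleft()
            main_read += 1
            progressed = True

        if main_read == total_rows:
            return "finished"
        if not progressed:
            # Nobody can move: the main buffer is full, the source is
            # blocked, and the consumer waits on an empty lookup buffer.
            return "deadlock"


if __name__ == "__main__":
    print(run_pipeline(total_rows=5, buffer_size=10))
    print(run_pipeline(total_rows=25, buffer_size=10))
    print(run_pipeline(total_rows=25, buffer_size=25))
```

In this model the run completes whenever the row count fits in one buffer and stalls once it does not, because the unread branch fills up and blocks the source that feeds the branch the consumer is waiting on. Raising the buffer size only moves the threshold; it does not remove the dependency.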