freifunk-berlin / falter-repo_builder

DEPRECATED: Build a package feed with dockerized OpenWrt-SDK

build_all_targets: limit concurrent jobs #11

Open Akira25 opened 3 years ago

Akira25 commented 3 years ago

This commit adds some code that limits the number of Docker containers that run in parallel. This will hopefully fix the nondeterministically missing files that we have seen in the package feed.

Signed-off-by: Martin Hübner martin.hubner@web.de
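The limiting could look roughly like this batched sketch. Everything here is illustrative: `MAX_JOBS`, the stub `build_feed`, and the target list are placeholders, not the real script's code.

```shell
#!/usr/bin/env bash
# Sketch only: cap how many build containers run at once by
# starting jobs in batches of MAX_JOBS and waiting between batches.
MAX_JOBS=2

build_feed() {
    # stand-in for the real "docker run ... openwrt-sdk" build of one target
    echo "built $1" >> /tmp/falter-demo-results.txt
}

: > /tmp/falter-demo-results.txt
i=0
for target in ath79 ramips x86-64; do
    build_feed "$target" &
    i=$((i + 1))
    if [ "$i" -ge "$MAX_JOBS" ]; then
        wait    # let the current batch finish before starting more
        i=0
    fi
done
wait    # wait for the last, possibly partial batch
```

With bash >= 4.3, `wait -n` could refill individual job slots instead of waiting for whole batches, but the batch form above is POSIX-portable.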

PolynomialDivision commented 3 years ago

This will hopefully fix the nondeterministically missing files that we have seen in the package feed.

Why?

Akira25 commented 3 years ago

I think that thousands of containers spawned at the same time cause some strange race conditions on the buildbot workers. @simzon already mentioned that when we had this code pre-merge. I suppose that limiting the number of parallel containers to something reasonable will avoid the issue of packages missing from the finished builds.

Even without that, it's a good idea to limit the number of parallel containers. An unknowing user might start the script in parallel without knowing how much processing power it will take. When I tested the parallel option on my machine, it became unresponsive due to the excessive workload.

PolynomialDivision commented 3 years ago

Even without that, it's a good idea to limit the number of parallel containers. An unknowing user might start the script in parallel without knowing how much processing power it will take. When I tested the parallel option on my machine, it became unresponsive due to the excessive workload.

That is a very good reason.

I suppose that limiting the number of parallel containers to something reasonable will avoid the issue of packages missing from the finished builds.

I would argue that it can still happen that we have a race condition, since we have concurrent build processes. But I don't have any other good solution for this problem. Maybe our architecture list is also wrong?

Akira25 commented 3 years ago

I would argue that it can still happen that we have a race condition, since we have concurrent build processes. But I don't have any other good solution for this problem. Maybe our architecture list is also wrong?

That could apply too; I'm not sure about that, though. Maybe this will fix that problem too, maybe not. We should probably keep observing that topic anyway.

PolynomialDivision commented 3 years ago

I would argue that it can still happen that we have a race condition, since we have concurrent build processes. But I don't have any other good solution for this problem. Maybe our architecture list is also wrong?

That could apply too; I'm not sure about that, though. Maybe this will fix that problem too, maybe not. We should probably keep observing that topic anyway.

But I don't understand why we have a race condition at all. :/ Maybe we have some architecture doubled? I will look into this build script again.
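The "architecture doubled" theory would be easy to check. A hedged sketch, with the file name and its contents made up for illustration:

```shell
#!/usr/bin/env bash
# Sketch: list any duplicated entries in an architecture list.
# The file and its contents are illustrative, not the repo's real list.
printf 'ath79\nramips\nath79\n' > /tmp/falter-demo-archs.txt

# sort groups identical lines together; uniq -d prints only duplicates
sort /tmp/falter-demo-archs.txt | uniq -d
```

An empty output would rule out doubled entries as the cause.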

pmelange commented 3 years ago

I'm not sure that this is the correct way to go. Since there will be multiple simultaneous executions of build_feed accessing the file /tmp/count.tmp, there could be race conditions. One option would be to lock the file before each use and to unlock it afterwards. Without a lock, NACK.
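The locking suggested here could be done with flock(1) from util-linux. A sketch under the assumption that the counter file holds a plain integer; the helper name is made up:

```shell
#!/usr/bin/env bash
# Sketch: serialize access to a shared counter file with flock(1),
# so concurrent build_feed instances can't interleave read and write.
COUNT_FILE=/tmp/falter-demo-count.tmp

increment_count() {
    (
        flock -x 9                                   # take an exclusive lock on fd 9
        n=$(cat "$COUNT_FILE" 2>/dev/null || echo 0)
        echo $((n + 1)) > "$COUNT_FILE"
    ) 9>"$COUNT_FILE.lock"                           # lock released when the subshell exits
}

echo 0 > "$COUNT_FILE"
for i in $(seq 1 20); do
    increment_count &    # 20 parallel writers race for the counter
done
wait
```

With the lock in place the counter ends at exactly 20; without it, interleaved read-modify-write cycles can lose increments.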

pmelange commented 3 years ago

Also, I just noticed that COUNT_FILE is not defined anywhere in build_feed.

Also, calling build_feed automatically will create a useless temp file.

Also, if two or more build_all_targets runs happen on the same machine, they will compete for the temp file.
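One way around that last point would be to give every run its own private counter file via mktemp(1) instead of a fixed /tmp/count.tmp, and remove it on exit. A hedged sketch; the names are assumptions:

```shell
#!/usr/bin/env bash
# Sketch: per-run counter file, so two build_all_targets runs on the
# same machine never share state, and nothing is left behind on exit.
COUNT_FILE=$(mktemp /tmp/falter-count.XXXXXX)   # unique path per run
trap 'rm -f "$COUNT_FILE"' EXIT                 # clean up the temp file
echo 0 > "$COUNT_FILE"
```

mktemp guarantees each invocation yields a distinct file, so concurrent runs cannot collide.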