d2iq-archive / dcos-flink-service

fixes #35 - Configure upload directory and reference a local persistent volume #36

Closed ANierbeck closed 7 years ago

EronWright commented 7 years ago

@ANierbeck The marathon template doesn't configure the task manager container, only the job manager. Unless I am mistaken, this fix would benefit only the job manager.

ANierbeck commented 7 years ago

ok ... do you have a pointer on where to find that?

EronWright commented 7 years ago

I mean, the issue cannot be fixed without some changes to Flink itself. It is Flink that creates the Task Manager containers, and so would need to acquire volumes as Marathon does. Sorry to dissuade you but the procedure is rather complicated.

I think we should close this PR now.


ANierbeck commented 7 years ago

ok ... that might be true, but this is already a problem for the job manager, which can also run into "no space left on device", especially when uploading more than one gigantic uber-jar
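
For context, the kind of change being discussed can be sketched as a Marathon app fragment that mounts a local persistent volume for the job manager and points the upload directory at it. This is an illustration only, not the template from this PR: the helper names, the `flink-uploads` path, the 4096 MiB size, and the sandbox path are hypothetical, and `jobmanager.web.upload.dir` is assumed to be the upload-directory option for the Flink version in use.

```python
def jobmanager_volume_fragment(container_path="flink-uploads", size_mib=4096):
    """Build a Marathon app fragment with a local persistent volume.

    Hypothetical helper; the path and size defaults are assumptions.
    """
    return {
        "container": {
            "type": "MESOS",
            "volumes": [
                {
                    # Path relative to the task sandbox.
                    "containerPath": container_path,
                    "mode": "RW",
                    # Persistent volume size in MiB.
                    "persistent": {"size": size_mib},
                }
            ],
        },
        # Assumption: Marathon releases of that era expected a residency
        # section for apps that use local persistent volumes.
        "residency": {"taskLostBehavior": "WAIT_FOREVER"},
    }


def upload_dir_setting(sandbox, container_path="flink-uploads"):
    """Render a flink-conf.yaml line for the upload directory.

    Assumption: jobmanager.web.upload.dir is the relevant option name.
    """
    return f"jobmanager.web.upload.dir: {sandbox}/{container_path}"
```

Note that this only touches the job manager task that Marathon launches; it does nothing for the task manager containers.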

EronWright commented 7 years ago

@ANierbeck I doubt the taskmanager.tmp.dirs option would help with general disk space issues. Volumes increase the complexity so I'm inclined to close this, in favor of an improvement to Flink to allocate disks for the Task Manager.

To briefly state what would be involved: Flink acts as a Mesos framework. As a framework it must reserve disk resources and create volumes as part of offer handling. Marathon makes this look easy but Marathon isn't involved here. Flink uses Netflix Fenzo under the hood to process offers, and it may require an enhancement to be useful here (especially if multiple disks per TM are desired). See Netflix/Fenzo#83.
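
To make the offer-handling step more concrete, here is a rough sketch of the two operations a framework would attach when accepting an offer: first reserve the disk, then create a persistent volume on the reserved disk. It is neither Flink nor Fenzo code. The function name and all argument values are hypothetical, and the field names follow the Mesos `RESERVE`/`CREATE` offer operations as I understand them, so check them against the Mesos documentation before relying on them.

```python
def reserve_and_create_operations(role, principal, disk_mib, volume_id, container_path):
    """Build RESERVE and CREATE operations for one persistent volume.

    Hypothetical helper; returns plain dicts shaped like Mesos offer operations.
    """
    # Disk resource dynamically reserved for the framework's role.
    reserved_disk = {
        "name": "disk",
        "type": "SCALAR",
        "scalar": {"value": disk_mib},
        "role": role,
        "reservation": {"principal": principal},
    }
    # The volume is the same reserved disk plus persistence and mount info.
    volume = dict(reserved_disk)
    volume["disk"] = {
        "persistence": {"id": volume_id, "principal": principal},
        "volume": {"container_path": container_path, "mode": "RW"},
    }
    return [
        {"type": "RESERVE", "reserve": {"resources": [reserved_disk]}},
        {"type": "CREATE", "create": {"volumes": [volume]}},
    ]
```

A call such as `reserve_and_create_operations("flink", "flink-principal", 2048, "tm-1-tmp", "tmp")` yields one operation pair per volume, so multiple disks per TM would mean repeating this for each disk in the offer.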

joerg84 commented 7 years ago

@EronWright the uploadDir is on the job manager, which is not started by Fenzo, right?

EronWright commented 7 years ago

It is true that the JM stores the uploaded JAR; each TM then downloads it into its own container.