jayunit100 commented 7 years ago

We need to add a top level scheduling_overview.md file, per https://github.com/kubernetes/kubernetes/issues/33188.

Alternatively, we could update the docs in the community repo, but I assume that at this point we want the scheduler covered in the user docs, given its increasing complexity and configurability with affinity and so on.

Suggested format (if we agree on this, I'll submit the PR; I have it mostly set up now, and it pulls from the implemented design proposals, diagrams, and other tidbits that are spread around):

@timothysc @ConnorDoyle @davidopp PTAL at the format below; if SGTM, I will issue the PR in the morning. I have a (mostly) complete overview for end users which we can start with for the 1.6 stabilization release.

The Kubernetes Scheduler

The scheduling algorithm

Predicates and priorities policies

Scheduler extensibility

Predicates and Priorities in detail

Ranking the nodes

Advanced Topics:

Affinity

Controller Conflicts and Pod Equivalence

Rescheduling

Taints and Tolerations

davidopp commented 7 years ago

@jayunit100 Can you clarify what the target audience is? Users? Admins? Developers? Today we have the following docs (unless I am forgetting some):

User documentation

https://kubernetes.io/docs/user-guide/node-selection/ (includes pod and node affinity/anti-affinity)
https://kubernetes.io/docs/user-guide/compute-resources/

Admin documentation

https://kubernetes.io/docs/admin/multiple-schedulers/

Developer documentation

Design docs

[ I have intentionally omitted the rescheduling documents because we are going to change that soon and I think the current state of affairs and terminology is super-confusing. But anyway, the current admin guide documentation for it is here. ]

The current state of affairs for documentation is not ideal, but I would suggest two principles for changing any of it: (1) keep the "what users and admins need to know" material separate from the "what developers need to know" material, as it is today. Don't overwhelm users and admins with things they don't need to know (e.g. the inner workings of the scheduler), but it's fine to link to the more detailed developer docs (just as it's fine to link from developer docs to design docs). (2) Build on the existing docs rather than creating more.

cc/ @kubernetes/sig-docs-misc

jayunit100 commented 7 years ago

Thanks for linking all of those.

- Yes, much of that information is what I was planning on consolidating. I had actually lost track of a few of those, so thanks again :). But maybe you're right that consolidation isn't the answer; links are better in some cases.
- audience == users, so references on how the scheduler works, without diving too deep, should be consolidated in the docs. Right now they are somewhat scattered, with node-selection being a reasonable starting point. However, let's remember that admins and developers are also a very influential and important set of users :). So just having a thin doc with a few interesting user-facing features probably isn't enough.
- goal = canonical reference point, but not necessarily a monolithic doc file. Parts of design docs (like affinity, multiple schedulers) should be upleveled or at least linked from a canonical scheduling page in the top level docs. A user, advanced user, admin, or even a developer should ideally have a single place where they can "start" diving into a problem. At the very least, this reduces bit-rot in the docs. At best, it gives us a consistent way to upgrade, review, and ensure comprehensive documentation. So, linking out from a canonical starting point probably satisfies your suggestion of not going too deep, while also solving the issue posed in the parent 1.6 milestone: consolidation, so that users can easily read about scheduler behaviour without needing to search various repos and websites.
- predicate and priority workflows should be user-facing documentation, i.e. the ones in the devel docs should be exposed to the user. After all, we expose the rollup of those via kubectl, they are the most common reason that a user fails to get an app running, and users can configure predicate behaviors.
- community/devel shouldn't be the only place holding info that users might reasonably want or need. Since scheduler extensions are a user feature, I guess those should be exposed in the user docs (as opposed to hidden in a separate repo). I'm under the assumption that community/devel docs are not guaranteed to be maintained for end users, so there needs to be a periodic movement into the top level docs once features are hardened. If not, community/devel will become the de facto standard for granular docs, and that may not be very convenient.
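To make the predicate and priority point concrete, here is a rough sketch of the two-phase idea that user-facing docs would need to explain: predicates filter out nodes that cannot run a pod, then priorities rank the nodes that remain. This is not the scheduler's actual code. The `Node` and `Pod` types, the two predicates, and the single scoring function are hypothetical simplifications chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu_millis: int
    labels: dict


@dataclass
class Pod:
    name: str
    cpu_request_millis: int
    node_selector: dict


# Hypothetical predicates: each returns True if the node can host the pod.
def fits_resources(pod, node):
    return pod.cpu_request_millis <= node.free_cpu_millis


def matches_selector(pod, node):
    return all(node.labels.get(k) == v for k, v in pod.node_selector.items())


PREDICATES = [fits_resources, matches_selector]


# Hypothetical priority: prefer the node with the most CPU left after placement.
def remaining_cpu(pod, node):
    return node.free_cpu_millis - pod.cpu_request_millis


def schedule(pod, nodes):
    """Return the chosen node name, or None if no node passes every predicate."""
    feasible = [n for n in nodes if all(p(pod, n) for p in PREDICATES)]
    if not feasible:
        # This is the case a user sees as a pod that stays unscheduled.
        return None
    return max(feasible, key=lambda n: remaining_cpu(pod, n)).name
```

A pod that no node can satisfy falls into the `None` branch, which is the situation the docs should help users diagnose.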

jayunit100 commented 7 years ago

Assuming we are mostly in agreement, the only question is whether we also agree that it's good to have a canonical starting doc for the scheduler (probably yes, given the comment above that "it's fine to give a link...").

So, the outstanding question is: what should be the canonical reference point for the scheduler docs?

docs: node-selection.md (we could rename it)
devel: scheduler.md (we could promote it to the docs repo)
other candidates ?

And then, once we decide on the canonical reference page, how should we structure it so that it doesn't overwhelm users?

Maybe we can have something like this?

user doc: scheduler.md + node-selection + scheduler_algorithm + extender.md
- (out-link) compute-resources
- (out-link) multiple-schedulers
- (out-link) pod affinity
- (out-link) taint-toleration-dedicated

That basically will combine the basics of the scheduler with links to the granular subtopics/developer docs/proposals.

jaredbhatti commented 7 years ago

Is this part of a particular release? Is there existing content for it? Have you connected with anyone in the docs team?

jayunit100 commented 7 years ago

Yeah, we all share the goal of 1.6 release doc cleanups. The only real question is how to handle deduplication and sprawl across the many repos that we have now.

After David's earlier feedback, my main suggestion is a canonical source of truth which links out to other sources. I just want to make sure we are on the same page before we start the work to clean up or consolidate as necessary.

steveperry-53 commented 7 years ago

Adding @chenopis and @devin-donnelly who are working on the structure of the Kubernetes docs.

chenopis commented 7 years ago

Yeah, I'm all for a canonical source of truth. Let's see where @devin-donnelly thinks that should be.

jayunit100 commented 7 years ago

FYI from the sig-scheduling meeting: we've all agreed that a single canonical integration point is the first step. I'm pulling the data together for that now.

fejta-bot commented 6 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta. /lifecycle stale

fejta-bot commented 6 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta. /lifecycle rotten /remove-lifecycle stale

fejta-bot commented 6 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close

kubernetes / website

top level scheduling doc #2276
