apache / pulsar

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org/
Apache License 2.0

PIP-261: Restructure Getting Started section #19912

Open asafm opened 1 year ago

asafm commented 1 year ago

Background

As PIP-98 explained, the Pulsar documentation site today is built like an encyclopedia. New and existing users alike are overwhelmed by it. Without a clear path per role (developer / DevOps / …), they resort to skimming or reading everything to fit the pieces of the puzzle together into a complete picture of the knowledge they need.

New users usually start with the Getting Started section, which today mainly focuses on starting Pulsar on your development computer in several ways, and then test-driving it by publishing and consuming messages using the CLI. It lacks a brief introduction to the subjects and terminology used throughout that section.

New users approaching a subject for the first time mainly fall into two types of learners:

  1. Reading - some people learn by reading all the material on the subject before trying.
  2. Doing - some people learn by “playing” with it - learn by example.

Today, the people who learn by reading are forced to read the entire Pulsar documentation site and fit the pieces together, which is an immensely high bar for newcomers. The ones who learn by example don’t have any examples in today’s Getting Started section and are forced to google their way around many sites until they get their answers.

PIP-98, among other things, explained we should have several guides:

The people that learn by reading, in the future, will use the Developer or Operator guide, as it will be their “book” for it. The people who learn by doing will use the new getting started section we aim to present here, catering to both developers and operators (also referred to as SREs, Infrastructure, DevOps roles).

This PIP is focused on providing a new structure (table of contents) for the Getting Started Guide.

Goal

Table of Contents

Sidebar

The sidebar will look like this:

Links

Discussion: https://lists.apache.org/thread/p8d8ks2ygqnq53oxqczxg2mtpf932wpg

Vote: https://lists.apache.org/thread/95p5mn873d6d3lsk5kgfks4n6x07x5pq

momo-jun commented 1 year ago

Hi Asaf, thanks for proposing this improvement. Generally it looks good. I support the motivation. Structure-wise, I have a couple of questions.

  1. Is it the TOC inside an all-in-one topic or the TOC of the left navigation? In other words, did you plan to provide an all-in-one Getting Started topic covering three subheadings, or three topics for three specific types of readers? It seems to be one topic in your proposal, which may blur the learning path of different roles. For example, as an operator, will they still go through Consume and Produce messages using the CLI or jumpstart from Deploy helm charts to k8s?

  2. What is the content mapping of existing topics? IIUC, the new structure covers the following three topics. What's your plan to do with Docker Compose?

    1. Consume and Produce messages using the CLI // containing both `getting-started-standalone` and `getting-started-docker`
    2. Get started for Devs
    3. Get started for Ops // containing `getting-started-helm`

asafm commented 1 year ago

Thanks for the feedback @momo-jun. From the looks of it, the doc website allows two depth levels, which means the left pane will have:

Getting Started Guide

Then each section will have the heading it contains per the TOC (depth level 3 will be H1, ...).

Regarding your second question on what to do with the existing Getting Started section: running Pulsar locally and in Docker is included in the CLI section, and running Pulsar in K8s is included in Operator Getting Started.

"Run a Pulsar cluster locally with Docker Compose" is actually missing. To tackle this, how about we change the TOC to:

momo-jun commented 1 year ago

Thanks for the further explanation. Adding a branch mode for Docker Compose looks good to me.

Now I only have one concern - the structure of the TOC is not MECE and might be difficult to understand.

And logically, Consume and Produce messages using the CLI is part of Dev Getting Started, while operators don't have to go through it - I'm afraid the naming cannot help them get to this point.

asafm commented 1 year ago

Your feedback is much appreciated and straight to the point.

How about we name the main headings as below?

  1. Introduction to Pulsar using CLI
  2. Introduction to Pulsar using sample applications
  3. Introduction to Pulsar using operational scenarios
Anonymitaet commented 1 year ago

Hi @asafm

Thanks for your awesome proposal. The real-world examples are great additions to the docs!

Issues

However, there are some issues in the current proposal:

  1. The learning paths of 3 roles are blurred.

    If we put all the topics (as below) into a single Get started, all roles will read them all by sequence, which means a clear learning path is not designed in real.

    Introduction to Pulsar using CLI
    Introduction to Pulsar using sample applications
    Introduction to Pulsar using operational scenarios
  2. Headings are lengthy.

    Main headings are too long.

    For example, for Introduction to Pulsar using CLI, users actually care little about the method (whether it's CLI or API) used to produce messages. What they want is to try it and get a successful result (by whatever method) in minimal time. So the "method" can be left out of headings to save space, since headings should show the key point and be as concise as possible.

Solutions

To resolve the issues above, I would suggest that:

  1. Create 3 guides for 3 roles respectively.

  2. Show 3 guides on the doc landing page to provide specific paths for different roles. Users do not need to wander on the Pulsar site or Google around to find suitable docs.

  3. Make 3 guides as subpages of https://pulsar.apache.org/docs since:

Benefits

This solution:

(1) Highlights the roles and gives them what they need clearly. No missing or duplicates (MVP).

(2) Makes short headings possible.

Besides, for the common docs (e.g., concepts, references) which should be reused, we can link them richly in the 3 guides.

image

Examples

image
asafm commented 1 year ago

Thanks for the detailed reply!

Regarding the suggestion of moving the getting started for each role into its own sub-section of a bigger guide (developer, operator):

I was thinking about it. My big plan was indeed to have 2 additional guides, a Developer Guide and an Operator Guide. If we zoom in, for example on the developer role, the two (getting started, developer) serve completely different purposes:

So, when I think about it, in my opinion it's confusing to have two contradictory sections in the same guide. We'll have a Getting Started section which is basically a tutorial. So the people who like to read first and do later will be confused: "So we're supposed to get started here, but what's going on? I see code here, no no no. I want to understand first. What's going on here?" On the other hand, the people who like to do first and read later will not search inside a Developer Guide for the getting started. For them, a Developer's Guide is a big scary book, filled with way too many details. If you ask them, all they want is tutorials, from the getting started ones to more complicated ones. So I imagined having a section in the docs named Tutorials that contains exactly that, grouped by role (developer, operator).

So from that perspective I prefer to have: Getting Started, Developer Guide, Operator Guide.

Regarding the second suggestion of having sub-pages of docs: you mean each guide will have its own "doc site"? I think it depends on whether Docusaurus allows more than 2 depth levels in the left sidebar. I personally like all docs to be in a single location - I don't like to jump around between sites. That's my personal preference.

Regarding

If we put all the topics (as below) into a single Get started, all roles will read them all by sequence, which means a clear learning path is not designed in real.

Why do you think that if the side bar has: Getting Started

then people will read them one after another? If I'm a developer, I would naturally ignore operational scenarios, right? Why do I care?

I do agree the titles are too lengthy. Maybe we can try:

Anonymitaet commented 1 year ago

Hi @asafm

Thanks for your detailed explanations!

I understand your points, and I'm trying to make the learning path more clear, simple, and direct for each role.

Reasons for designing 3 guides

1. Give prominent directions for users

Suppose that you're at a fork in the road, it's most clear for you to choose one way if the sign indicates the direction.

image

This is the same for doc users. Whatever the user archetypes ("doing" or "learning") are, the most important thing is they're seeking solutions to resolve issues based on their roles. The roles are signs.

So if we design the doc IA as below, users just need to choose one way based on their roles and finish the rest of the journey. There is nothing else they need to take into consideration. It's simple, clear, and direct.

image

2. Provide required minimal info for users

If we put all the topics (as below) into a single Get started, all roles will read them all by sequence, which means a clear learning path is not designed in real.

Clarification: "read them all by sequence" means "all roles need to glance over all 3 headings (even though they are interested in just one and click it later)" rather than "read them (docs) one after another".

But if we design 3 guides respectively:

In this way, we provide the required info for each role at a minimal amount. Users will like it because it:

D-2-Ed commented 1 year ago

I wholeheartedly agree that the role-based learning paths are a great idea for future iterations of the documentation.

In the short-term, it's a quick win to incrementally update the GSG. I suggest we title the first one "quickstart" and also link to it from GS menu on home page. And then call the others what they are: tutorials. WDYT?

Getting Started

tisonkun commented 1 year ago

Hi @asafm! Thanks for starting this thread.

I reviewed this proposal in two aspects.

Content

For three journeys in the proposal, we have contents for two of them:

The closest pages for getting started with operations are under the "Administration" chapter https://pulsar.apache.org/docs/2.11.x/administration-zk-bk/, while we don't have a portal page or getting started page.

Structure

The "Get Started" chapter is located on the top of the sidebar, and it should be fine.

The "Tutorial" chapter is somewhat hard to find, so we may set up some links or refactor the content and merge it into Get Started chapter.

For reference, grpc-java has a Quickstart page to run the very simple demo and then a "Basic tutorial" page to talk about every basic concept.

The operations getting started content still needs to be written, and we may place it as the first item of the "Administration" chapter.

asafm commented 1 year ago

Ok. I'll try to combine the suggestions made above.

How about we have 3 headings as below:

The quick start will contain the content I've placed under "Consume and Produce messages using the CLI". The main idea: give any role the ability to "feel" Pulsar locally, using the CLI.

The Developer Guide will be, over time, a comprehensive guide, like a book, for learning Pulsar, targeted at developers. Its Getting Started section will contain what is placed under "Developer Getting Started", mainly aimed at people who wish to learn by "hand", as I explained in previous comments and in the PIP.

Same with The Operator Guide, but for Operators (DevOps).


@tisonkun @D-2-Ed I don't like calling the Getting Started section a tutorial, although it is built as one. People expect a Getting Started section to look like a tutorial. I do think in the future we can have a dedicated Tutorial subsection for each guide.


@tisonkun

For three journeys in the proposal, we have contents for two of them:

Getting started with CLI: https://pulsar.apache.org/docs/2.11.x/getting-started-home/

Getting started for sample applications: https://pulsar.apache.org/docs/2.11.x/how-to-landing/

The CLI journey - I plan to take the content from all three, but simply structure it differently:

Based on the revised solution I wrote, it will be located under Quick-start. Under it, you'll have two steps: (1) Starting Pulsar Locally (2) Publish and Consume messages using the CLI

Step (1) will contain subsections on starting it locally using the downloaded binary or Docker, both in standalone mode, or using Docker Compose as a complete cluster (including ZK, BK). Once you have a cluster up and running, you can continue to step (2) and use the CLI to publish and consume messages.

Today, it's copied and pasted across each flavor of starting Pulsar.

So in summary, I plan to re-use the existing content, and mainly restructure it.

Regarding the Developer Guide / Getting Started: you mentioned "https://pulsar.apache.org/docs/2.11.x/how-to-landing/". This gives you a broken-up tutorial (not one with steps). Most importantly, it does not use code, only the command line. The developer getting started section aims to have a working application. Actually, 2 of those, matching the most popular use cases for Pulsar, as detailed in the PIP.

I hope I answered all of your comments @tisonkun @D-2-Ed @Anonymitaet. Would love to hear your feedback on the suggestion I wrote at the beginning of the comment.

Anonymitaet commented 1 year ago

@asafm @D-2-Ed thanks for your explanations!

Record some discussions here for further learning:

asafm commented 1 year ago

I've updated the PIP according to all comments.

tisonkun commented 1 year ago

I've updated the PIP according to all comments.

Thanks for your updates @asafm! I believe it's good to go for a vote now.

hangc0276 commented 1 year ago

@asafm Thanks for your proposal.

From the engineering side, the new document structure matches beginners' reading behavior. I like learning by doing and understanding the key concepts in practice. The getting started section works like a book with a real-world example that shows which cases Pulsar can work for and how it works. Our current website divides the content into several parts, and it's a little hard for beginners to link them together on a first reading.

As for the discussion about creating 3 clear guides for 3 roles, I think it may not be so important in the Getting Started section. The Getting Started section aims to provide basic knowledge of Pulsar for all the roles. In fact, that knowledge is the basic part for all 3 roles.

For the concept part, I suggest doing some comparisons between different concepts, such as different subscription types.

This example will include a brief explanation about: Partitioned topic; Failover subscription type; Key in message; Key-shared subscription; Scaling consumers; Correctly acknowledging key-shared subscription; Correctly acknowledging failover subscription

Overall LGTM. I think we can start the vote.

github-actions[bot] commented 1 year ago

The issue had no activity for 30 days, mark with Stale label.

asafm commented 1 year ago

This will be in motion soon

github-actions[bot] commented 1 year ago

The issue had no activity for 30 days, mark with Stale label.

asafm commented 5 months ago

My plan - TOC - for the documentation site:

Quick-start Guides

Developer Guide

Operator Guide

Contributor Guide