canonical / cloud-init

Official upstream for the cloud-init: cloud instance initialization
https://cloud-init.io/

[docs]: The terms "#include" and "#include-once" are difficult to define in the documentation #4408

Open pjmattingly opened 1 year ago

pjmattingly commented 1 year ago

Documentation request

https://cloudinit.readthedocs.io/en/latest/explanation/format.html https://cloudinit.readthedocs.io/en/latest/explanation/boot.html#network

The term #include seems to be defined here (1), but it is difficult to place in context without reading the entire page. For example, the section includes the sentence:

The file contains a list of URLs, one per line. Each of the URLs will be read and their content will be passed through this same set of rules, i.e., the content read from the URL can be gzipped, MIME multi-part, or plain text.

It's not clear on first reading what "this same set of rules" means in this context. Only reading the entire page shows that this term is associated with information gleaned from datasources. As such, a self-contained section defining "#include" may better serve readers.
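To make the quoted description concrete, here is a minimal Python sketch of how a payload beginning with "#include" or "#include-once" might be split into its list of URLs. It assumes the header sits alone on the first line and that blank lines and lines starting with "#" are skipped. The function name `parse_include_payload` is hypothetical and is not cloud-init's API; the real implementation may behave differently.

```python
from typing import List, Tuple


def parse_include_payload(user_data: str) -> Tuple[bool, List[str]]:
    """Return (is_include_once, urls) for an include-style payload.

    Hypothetical helper for illustration only; not part of cloud-init.
    """
    lines = user_data.splitlines()
    header = lines[0].strip() if lines else ""

    if header == "#include-once":
        once = True
    elif header == "#include":
        once = False
    else:
        raise ValueError("payload does not start with #include or #include-once")

    urls = []
    for line in lines[1:]:
        line = line.strip()
        # Assumption: blank lines and "#" comment lines carry no URL.
        if not line or line.startswith("#"):
            continue
        urls.append(line)
    return once, urls
```

Each returned URL would then be fetched and its content handled by the same format rules as any other user-data, which is what "this same set of rules" appears to refer to.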

The term "#include-once" also appears to be undefined in the documentation. The string "include-once" is present in this section (1), but it is not defined there. Only by reading the source (2, 3, 4) can a tentative definition be found. It would also be helpful to describe a use case for "#include-once", as it's not clear why one would want to use this construction. For example, (3) does provide a good definition, but it doesn't explain when "one-time-use or expiring URLs" would be useful or needed:

This file will just be downloaded only once per instance, and its
   contents cached for subsequent boots.  This allows you to pass in
   one-time-use or expiring URLs.

The phrasing of this sentence also does not clearly indicate that providing "one-time-use or expiring URLs" relates to passing sensitive information to the instance. Searching for that term (5) does turn up a few links (6, 7, 8) that discuss best practices for sensitive information, but such links are few and far between in the search results. As such, a footnote giving context for this sentence may be helpful. Another sensible addition might be a reference to "Further Reading" resources for first-time users.
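The "downloaded only once per instance, and its contents cached" behaviour quoted from (3) can be sketched roughly as follows. This is an illustration under stated assumptions, not cloud-init's actual code: `fetch_once`, the `cache_dir` layout, and the injected `fetch` callable are all hypothetical names chosen for the example.

```python
import hashlib
from pathlib import Path
from typing import Callable


def fetch_once(url: str, cache_dir: Path, fetch: Callable[[str], str]) -> str:
    """Fetch `url` at most once, caching the content under `cache_dir`.

    Hypothetical helper: `cache_dir` stands in for a per-instance
    directory, and `fetch` is whatever function performs the download.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Key the cache entry on a hash of the URL so any URL maps to a safe filename.
    cache_file = cache_dir / hashlib.sha256(url.encode("utf-8")).hexdigest()

    if cache_file.exists():
        # Later boots reuse the cached copy; the URL is never contacted again.
        return cache_file.read_text()

    content = fetch(url)
    cache_file.write_text(content)
    return content
```

Because later calls never touch the URL again, the URL only has to be valid for the first boot. That is why a one-time-use or expiring URL works here, where it would fail on a later boot if the content were re-downloaded each time.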

Finally, the terms #include and #include-once are difficult to search for because they contain the character "#", which is automatically stripped from Google search inputs. This makes discovering information about these terms difficult, so I believe special care should be taken to define them more explicitly.

1) https://cloudinit.readthedocs.io/en/latest/explanation/format.html#include-file
2) https://github.com/canonical/cloud-init/blob/dec3b65e4634783081be3e191aa88a4d2e46fd02/doc/examples/include-once.txt#L4
3) https://github.com/canonical/cloud-init/blob/dec3b65e4634783081be3e191aa88a4d2e46fd02/doc/userdata.txt#L40
4) https://github.com/canonical/cloud-init/blob/dec3b65e4634783081be3e191aa88a4d2e46fd02/ChangeLog#L4182
5) https://www.google.com/search?q=one-time-use+or+expiring+URLs+cloud+init
6) https://cloud.google.com/storage/docs/access-control/signed-urls
7) https://groups.google.com/g/ec2ubuntu/c/7TpLuyCDcyw?pli=1
8) https://www.nginx.com/blog/securing-urls-secure-link-module-nginx-plus/

holmanb commented 1 year ago

@pjmattingly I agree, this could be much better documented. Thank you for describing in detail the issues with this documentation.

Do you think you have enough context now to suggest specific improvements to the documentation? If so, I'd be happy to help review a pull request and/or help get suggestions incorporated into the documentation. Since you seem to have a solid grasp of what needs improving here, your contribution would be much appreciated!

s-makin commented 1 year ago

Thanks for your feedback @pjmattingly - I think you're absolutely right that we should have at least a section that defines #include and #include-once. I also agree with Brett: if you'd like to make a contribution, we would very much welcome that. I think it would be a really helpful change.

As a side FYI, I've been doing a bit of digging into RTD search lately and they don't actually use Google-powered search for their internal search field. This means that the # symbol in the search term shouldn't be a problem, at least in the in-docs search. For Google search/discoverability it's definitely a problem, and I think it would be solved by having a section with the title "#include and #include-once" to get around Google's stripping of "#" from the search input.