Closed giomasce closed 11 years ago
I started addressing the issue, updating Documentation.tex. The problem is, it contains a lot of duplications (for example, compared with README.md) and it is a poor location for documentation, since many users probably will not bother compiling it to a pdf.
My suggestion is to delete the parts of Documentation.tex that are step-by-step guides, and integrate them in README.md, if they are essential instructions, or in some txt file in doc/. The intent is to keep in Documentation.tex only high-level motivations that help users understand the philosophy, but not required to run CMS; thus delegating more practical instructions to txt files.
This also goes towards a defrag of the documentation. What do you think about moving also the troubleshooting wiki into doc/?
Finally, what do you think is missing from the documentation? I plan to add explanation of the task and score types (including parameters), of the tokens mechanism, and expand a bit what to do when the database schema is different from the expectations of CMS. Any other ideas?
If you want to take over some of these you are welcome :)
I forgot: I also propose to delete the description of the RPC interface in Documentation.tex: it is so far behind the implementation that it makes more sense to write a better one from scratch than to improve the current one.
I propose to move all the documentation to the GitHub wiki (except for the README.md).
Pros:
Cons:
Ciao.
On 18/12/2012 10:16, Luca Wehrstedt wrote:
I propose to move all the documentation to the GitHub wiki (except for the README.md).
I agree. Although:
Pros:
- it'll be paged, hypertextual, non-linear, searchable, etc.
That's not necessarily a pro: more than once I have had to understand documentation written in a hypertextual, non-linear fashion, when what I was looking for was a perfectly linear, non-hypertextual explanation of how the thing worked. I didn't want to read billions of pages and links and then reconstruct the information I required.
That's to say that users also need a plain and linear introduction to the system, otherwise they find it difficult to understand anything else.
- it'll be easier for users to find (hopefully...)
- it'll be easier to consult (users won't have to clone the repo and compile the .tex file)
Well, we could publish a compiled version of it...
- everyone will be able to edit it, thus external contributions will be very easy (and extremely welcome)
- it'll still be a (separate) git repo, so we will keep all history and we will be able to clone it and edit the pages offline
Cons:
- we have to rewrite all existing documentation in one of the markup languages supported by GitHub (Markdown, reStructuredText, etc.)
- the wiki repo will be separate from our main code repo, this means that we will have to manage two repos that could go "out of sync" (not sure of the implications of this... could even not be a con, after all...)
Well, they're heavily out-of-sync even now when they stay in the same repository. I wouldn't bother so much about this point.
Giovanni Mascellani mascellani@poisson.phc.unipi.it Pisa, Italy
Web: http://poisson.phc.unipi.it/~mascellani Jabber: g.mascellani@jabber.org / giovanni@elabor.homelinux.org
On 18/12/2012 08:37, Stefano Maggiolo wrote:
I started addressing the issue, updating Documentation.tex. The problem is, it contains a lot of duplications (for example, compared with README.md) and it is a poor location for documentation, since many users probably will not bother compiling it to a pdf.
If users don't want to "bother" compiling things, they can also avoid "bothering" using CMS.
My suggestion is to delete the parts of Documentation.tex that are step-by-step guides, and integrate them in README.md, if they are essential instructions, or in some txt file in doc/. The intent is to keep in Documentation.tex only high-level motivations that help users understand the philosophy, but not required to run CMS; thus delegating more practical instructions to txt files.
If you want to switch the docs to Markdown, that's OK for me, but I really would prefer having everything in the same place. I think we can just drop LaTeX and also put the high-level motivations in Markdown.
Finally, what do you think is missing from the documentation? I plan to add explanation of the task and score types (including parameters), of the tokens mechanism, and expand a bit what to do when the database schema is different from the expectations of CMS. Any other ideas?
It's more or less what I would have thought of.
- it'll be paged, hypertextual, non-linear, searchable, etc.
That's not necessarily a pro: more than once I have had to understand documentation written in a hypertextual, non-linear fashion, when what I was looking for was a perfectly linear, non-hypertextual explanation of how the thing worked. I didn't want to read billions of pages and links and then reconstruct the information I required.
That's to say that users also need a plain and linear introduction to the system, otherwise they find it difficult to understand anything else.
Good point. Note that, nevertheless, we would still have the README, which is a linear and very basic introduction to the system. I hope that, after reading it and experimenting a bit with the system, a user will already want to get details on specific topics and will not mind having a non-linear documentation. (BTW, we could also provide a "guided path" through the documentation, "linearizing" it). I should have put the "searchable" point by itself, since it doesn't fit with the others and IMO it's one of the most important ones...
If users don't want to "bother" compiling things, they can also avoid "bothering" using CMS.
I don't quite agree: a user may want to read the documentation just to see whether it's worth "bothering" to use CMS...
Any other ideas?
We could describe the YamlImporter format, since it seems to be the one people use most. Actually, I'm not sure if we want to "promote" it further, since it's rather ugly and I think we should change it ASAP (see #71) and given that we already have a great example contest (at cms/con_test) written in that format...
I agree with having a single place for the documentation (maybe excluding the readme). I'd be happy if this place was the wiki, since I like having links when needed. The problem with the wiki is that, to my knowledge, there is no easy way to tell GitHub "I want to see the wiki as it appeared when releasing version 1.0". With the current model it is essential to expose different documentation for each version. We could implement it by hand, but that's the reason why I did not propose the wiki in the first place. Did you consider this (or is there something I don't know)?
Yes, I think you're right: it's not possible to navigate a certain "snapshot" of the wiki (it's possible to see old versions of a page, as with every wiki software...). I don't know if other wiki platforms support that...
Yet, I don't think it would be difficult to implement this by hand... since the wiki is just a collection of files organized in directories we would only need to create a new directory for each version, copying the contents from the previous version.
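To make the "by hand" approach concrete, here is a minimal sketch of what creating the snapshot for a new version could look like. It assumes the wiki checkout holds one directory per version; the function name `snapshot_docs` and the layout are hypothetical, not something CMS or GitHub provides.

```python
import shutil
from pathlib import Path


def snapshot_docs(wiki_root: Path, previous: str, new: str) -> Path:
    """Copy the documentation directory of `previous` into a new one for `new`.

    Hypothetical layout: <wiki_root>/<version>/... with one directory per release.
    """
    source = wiki_root / previous
    target = wiki_root / new
    if not source.is_dir():
        raise FileNotFoundError(f"no documentation directory for version {previous}")
    # copytree raises FileExistsError if the target already exists, so an
    # already published version cannot be overwritten by accident.
    shutil.copytree(source, target)
    return target
```

After the copy, the new directory would be edited for the new release while the old one stays frozen.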
If also Giovanni agrees with this, I'll start migrating stuff to the wiki, first with placeholders, then with actual content (so if someone is willing to help me we can split the work of writing the new stuff).
As a plus, in this way we can offer up to date documentation for version 0.9, which is good I guess.
Ciao.
On 18/12/2012 18:30, Stefano Maggiolo wrote:
If also Giovanni agrees with this, I'll start migrating stuff to the wiki, first with placeholders, then with actual content (so if someone is willing to help me we can split the work of writing the new stuff).
I already agreed! :-)
About the multiple versions: for a user it is far better to have just one document that explains the differences among versions only when such differences exist. Having to compare different (mostly identical) revisions of the same document is a terribly difficult and error-prone activity. I see no point in forcing our users into that.
About the multiple versions: for a user it is far better to have just one document that explains the differences among versions only when such differences exist. Having to compare different (mostly identical) revisions of the same document is a terribly difficult and error-prone activity. I see no point in forcing our users into that.
I don't see why a user should compare the different versions: I suppose he/she will consult just the one corresponding to the version of CMS he/she is using, or not? And, actually, having multiple, completely separated and "parallel" versions of the documentation, one for each (major) release, seems to me a common approach. Take the SQLAlchemy documentation as an example: the first thing to do when you want to see documentation is to choose the version.
I agree with Luca with the additional idea that when we release a new version we attach a change log including the user-facing changes and the instructions on how to move data to the new version.
On 18/12/2012 19:51, Luca Wehrstedt wrote:
I don't see why a user should compare the different versions: I suppose he/she will consult just the one corresponding to the version of CMS he/she is using, or not?
If he wants, he can do so. But sometimes he may want to recall what changed from one version to another (why doesn't this contest work anymore? Why does another random guy use a different configuration? Why does another random guy get different results?), or just get a visual impression of how a specific section of the manual changed from one version to another (without reading all the changelogs from the beginning).
In my idea, parts that refer to previous releases are clearly marked, so there is no risk of confusion.
And, actually, having multiple, completely separated and "parallel" versions of the documentation, one for each (major) release, seems to me a common approach. Take the SQLAlchemy documentation as an example: the first thing to do when you want to see documentation is to choose the version.
Which means that none of us can understand whether something is going to break if we use a different version of SQLAlchemy. If we need to debug a version-related problem, we have to compare two different documents and "manually" compute the meaningful differences. If differences are clearly documented, everything is easier.
However, if you really do not like this idea, let's stick to yours. But I really don't expect changelogs and release notes to replace the information we're losing.
Actually, I was once debugging an SQLAlchemy-related problem and, when it turned out to be a version mismatch, the first thing I looked at was their changelogs, which clearly showed that my assumption no longer held in the new version. If something breaks, the changelogs/release notes tell you why it broke and (if short enough) how to fix it; the documentation of the new version tells you the new proper way of doing stuff. I don't think we'll lose information.
Ciao.
On 19/12/2012 01:04, Stefano Maggiolo wrote:
Actually, I was once debugging an SQLAlchemy-related problem and, when it turned out to be a version mismatch, the first thing I looked at was their changelogs, which clearly showed that my assumption no longer held in the new version. If something breaks, the changelogs/release notes tell you why it broke and (if short enough) how to fix it; the documentation of the new version tells you the new proper way of doing stuff. I don't think we'll lose information.
There is no point in discussing things that are mostly a matter of taste. If you and Luca prefer having different copies of the docs, let it be.
But I'd just like to point out that if SQLAlchemy had all the docs in just one place instead of three, maybe you would never have had anything to debug...
I think we can close this after pushing https://github.com/stefano-maggiolo/cms/commit/e78ed922f2e018974db75afa04c08e6258b587bd and https://github.com/stefano-maggiolo/cms/commit/ae94d58e321233b7fbeee18ac4d638ac3834a62b. Comments? (In particular, on having removed most content from the README.)
Apart from the comments I wrote on the individual commits, I ask you to wait a moment because I still have some fixes to do (mostly spelling and grammar, and a couple of changes to the dependency list).
There are still files inside the "doc" directory (note that I put all the Sphinx documentation inside the "docs" directory because it was more "common" among other projects but also to see clearly how much "old" documentation we still have to migrate (or simply delete) by looking at how many files remain in the "doc" directory).
I think that this issue cannot be closed until the "doc" directory is no more.
So, let's take a look at the contents of "doc":
.gitignore
Contains just LaTeX-related rules: can be deleted.
32bits_sandbox_howto.txt
It still applies to CMS (even if it's a little outdated, as you can see by the names it uses for "recent" Ubuntu releases) but it won't anymore as soon as we switch to the new sandbox. We can consider including it into the Sphinx documentation (to drop it immediately after v1.0) or dropping it now. I'm in favor of the latter.
cms_netboot.txt
It still applies to CMS, but the text itself seems little more than a draft. If we want to include it into the Sphinx docs someone (@giomasce?) needs to complete it. If we don't want to (or can't) do it now we can delete the file and recover it from git's history when we want to include it in the official documentation (I don't like keeping half-finished files around).
discrepancies_2011.txt
I think we can remove it, since many things have been implemented in CMS, or they're not CMS's business, or the rules have been rewritten (at IOI 2012) to make CMS adhere to them.
license.txt
I think this needs to be kept, but I'm not sure where. Neither "doc" nor "docs" seems a proper place. Perhaps it could go in the root of the repository... any other suggestions?
sandbox.txt
I don't see the use of this file: I suggest deleting it.
submission_scoring_time_proposal.txt
Since it was written a few things happened (AWS became able to show a ranking, extra_time was implemented) that made some of its assumptions invalid. The proposals that still apply are, IMO, the introduction of a proper analysis phase and a consequent change (in SS and all other code that computes scores) to consider only submissions inside the "contest" time-frame. These proposals could be put inside appropriate issues to keep better track of them.
task_versioning_proposal.txt
Implementation for this feature is underway (making this file superfluous), even if it'll be merged after 1.0. Yet, I think we can't delete this file now.
Let me know what you think ASAP: we want to release 1.0 and this is the last bug standing!
PS: we had some documentation in the GitHub wiki... since we have now definitely moved to Sphinx we can remove that too.
I'm not a core dev, but here are my 2 euro cents anyway :)
32bits_sandbox_howto.txt
It still applies to CMS (even if it's a little outdated, as you can see by the names it uses for "recent" Ubuntu releases) but it won't anymore as soon as we switch to the new sandbox. We can consider including it into the Sphinx documentation (to drop it immediately after v1.0) or dropping it now. I'm in favor of the latter.
We could add a line to 1.0 docs "Using a 32-bit sandbox on a 64-bit system is not supported. However, you can read some notes on how this might be done here: ".
cms_netboot.txt
It still applies to CMS, but the text itself seems little more than a draft. If we want to include it into the Sphinx docs someone (@giomasce?) needs to complete it. If we don't want to (or can't) do it now we can delete the file and recover it from git's history when we want to include it in the official documentation (I don't like keeping half-finished files around).
I'd vote for just opening an issue "maybe document net boot approach", with a link to the file at that revision (so it's not lost deep within git forever).
discrepancies_2011.txt
I think we can remove it, since many things have been implemented in CMS, or they're not CMS's business, or the rules have been rewritten (at IOI 2012) to make CMS adhere to them.
Ack.
license.txt
I think this needs to be kept, but I'm not sure where. Neither "doc" nor "docs" seems a proper place. Perhaps it could go in the root of the repository... any other suggestions?
I'd suggest moving it to the root as LICENSE.txt, and moving the existing AGPL text into LICENSE-AGPL3.txt.
submission_scoring_time_proposal.txt
These proposals could be put inside appropriate issues to keep better track of them.
Ack.
task_versioning_proposal.txt
Implementation for this feature is underway (making this file superfluous), even if it'll be merged after 1.0. Yet, I think we can't delete this file now.
I think we can delete it. It's not relevant to the 1.0 docs. For 1.x, the only thing that it mentions which doesn't exist is ES priorities for active vs hidden datasets. We can just make that a GitHub issue (along with actually documenting the task versioning implementation).
I more or less agree with everything, apart from the general principle of "everything must be either in Sphinx or in an issue". I think the proper place for articulated proposals of non-trivial features is in the repo; also, I don't like having very specialized documents in Sphinx (such as the netboot or the 64bit instructions). I like that this information stays around for people who need it, but it shouldn't clutter the main docs. Maybe in appendices, if Sphinx supports them?
The general principles were actually "all documentation has to be in Sphinx to be easily reachable and not dispersed" and "all proposals have to be in a place where it's easy to see, discuss and keep track of them (i.e. a thread on the mailing list or, much better, an issue on GitHub)" (a file in the repository doesn't satisfy any of these conditions).
This is what I'll do now:
.gitignore
Delete it.
32bits_sandbox_howto.txt
Move to docs and put a link to it in the Sphinx documentation. Delete immediately after 1.0 (and remove the link).
cms_netboot.txt
Delete it but open an issue to ask someone to finish it and include it in the Sphinx documentation (attaching a link to the file version at the current commit).
discrepancies_2011.txt
Delete it.
license.txt
Rename $REPO/LICENSE.txt to $REPO/LICENSE-AGPL3.txt and move $REPO/doc/license.txt to $REPO/LICENSE.txt.
sandbox.txt
Delete it.
submission_scoring_time_proposal.txt
Open issues to keep track of the remaining proposals.
task_versioning_proposal.txt
Delete it.
As for the appendices in Sphinx (to hide all the very technical documentation), how does your idea of them differ from a simple chapter (as we do now for all other documentation)?
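For the license.txt step in the plan above, the order of the two moves matters, since both involve the name LICENSE.txt. A minimal sketch, assuming a plain directory tree (the function name `shuffle_license_files` is hypothetical, and in the actual repository one would use `git mv` so that history follows the files):

```python
from pathlib import Path


def shuffle_license_files(repo: Path) -> None:
    """Apply the two license moves from the plan, in a safe order."""
    agpl_text = repo / "LICENSE.txt"
    project_license = repo / "doc" / "license.txt"
    # Step 1: free the LICENSE.txt name first. Doing step 2 first would
    # overwrite the AGPL text (or fail, depending on the platform).
    agpl_text.rename(repo / "LICENSE-AGPL3.txt")
    # Step 2: the project license notice takes over the LICENSE.txt name.
    project_license.rename(repo / "LICENSE.txt")
```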
Done.
I'm retargeting this issue to milestone 1.x to remember to delete docs/32bits_sandbox_howto.txt when we introduce the new sandbox, but apart from that this issue is virtually resolved.
It seems to me that this issue has already been fixed.
CMS is severely lacking documentation. What is available is usually out-of-date. Particularly, TaskTypes and ScoreTypes are not documented at all.