gluster / glusterdocs

This repo contains the source of official Gluster documentation rendered at https://docs.gluster.org
MIT License

Duplication of glossary/terminology overview documents #194

Open mbukatov opened 7 years ago

mbukatov commented 7 years ago

We have a glossary of gluster terminology in both the Quick Start Guide and the Administrator Guide.

The problem is that the glossary content is duplicated in 2 separate files with different formatting, which makes it hard to maintain and keep in sync.

Details

There are 2 files with a gluster glossary overview: Quick-Start-Guide/Terminologies.md and Administrator Guide/glossary.md.

The number of items explained in each file differs greatly:

$ grep '^###' Quick-Start-Guide/Terminologies.md | sed 's/### //' | wc -l
19
$ grep '^\*\*' Administrator\ Guide/glossary.md | sed 's/\*\*//g' | wc -l
45

There is a lot of duplication. The following terms are explained in both files:

$ comm -12 <(grep '^\*\*' Administrator\ Guide/glossary.md | sed 's/\*\*//g' | sort) <(grep '^###' Quick-Start-Guide/Terminologies.md | sed 's/### //' | sort)
Brick
Client
Cluster
Distributed File System
FUSE
Geo-Replication
glusterd
Metadata
Namespace
POSIX
RAID
RRDNS
Server
Trusted Storage Pool
Userspace
Volume
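
When planning a merge, it may also help to see which terms exist in only one of the files. The following is a minimal sketch, not a tested part of the docs tooling: it reuses the grep/sed patterns from the commands above, while the function names (heading_terms, bold_terms, unique_terms) and the trailing-whitespace cleanup are assumptions added for illustration.

```shell
#!/bin/sh
# Sketch: list glossary terms that appear in only one of the two files.
# Function names are hypothetical helpers introduced for this example.

# Terms from a file that marks each term as a "### Term" heading.
heading_terms() {
    grep '^###' "$1" | sed -e 's/### //' -e 's/[[:space:]]*$//' | sort
}

# Terms from a file that marks each term as a "**Term**" line.
bold_terms() {
    grep '^\*\*' "$1" | sed -e 's/\*\*//g' -e 's/[[:space:]]*$//' | sort
}

# Usage: unique_terms HEADING_FILE BOLD_FILE
# Prints terms found only in HEADING_FILE, then terms found only in BOLD_FILE.
unique_terms() {
    tmpdir=$(mktemp -d) || return 1
    heading_terms "$1" > "$tmpdir/heading"
    bold_terms "$2" > "$tmpdir/bold"
    echo "Only in $1:"
    comm -23 "$tmpdir/heading" "$tmpdir/bold"
    echo "Only in $2:"
    comm -13 "$tmpdir/heading" "$tmpdir/bold"
    rm -rf "$tmpdir"
}

# Example call, run from the repository root:
# unique_terms Quick-Start-Guide/Terminologies.md "Administrator Guide/glossary.md"
```

The terms printed under the second heading would be the ones a merged glossary has to carry over from the Administrator Guide.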

Here is an example of how the explanation for Brick looks in each file. The first is from Quick-Start-Guide/Terminologies.md:

### Brick                                                                       

Brick is the basic unit of storage, represented by an export directory          
on a server in the trusted storage pool.                                        

While this is from Administrator Guide/glossary.md:

**Brick**                                                                       
:   A Brick is the basic unit of storage in GlusterFS, represented by an export 
    directory on a server in the trusted storage pool.                          
    A brick is expressed by combining a server with an export directory in the following format:

        `SERVER:EXPORT`                                                         
    For example:                                                                
        `myhostname:/exports/myexportdir/` 

Expected state

There should be no duplication of information.

We could, for example, store the content in a single file and include that file in both guides.

humblec commented 7 years ago

@mbukatov Thanks! Would you like to send a PR for this?

mbukatov commented 7 years ago

@humblec Do you agree with squashing both documents into a single one, which would be included in both guides without duplication?

nvtkaszpir commented 7 years ago

I think squashing is a must, otherwise it may just introduce too much confusion.

How about this:

This way the quick start introduces the basic concepts without requiring readers to go through tons of text at the start. By keeping references to the Administrator Guide glossary file, we can also merge the descriptions from both files. The first sentence of each glossary entry could be the same as in Terminologies.

The best idea would actually be converting the documentation to reStructuredText - it provides much better flexibility in managing documentation via linking between articles or including subsections. But this would take some time and would make the docs a bit more complex to configure.

humblec commented 7 years ago

@prashanthpai ^^^ Any thoughts here?

nvtkaszpir commented 7 years ago

Just looked at the docs again after waking up. There's:

So actually there are at least three places, and I think it's one too much :)

prashanthpai commented 7 years ago

The best idea would actually be converting the documentation to reStructuredText - it provides much better flexibility in managing documentation via linking between articles or including subsections. But this would take some time and would make the docs a bit more complex to configure.

Agreed. New subprojects use rst. We're stuck with markdown for glusterdocs though, for now.

So actually there are at least three places, and I think it's one too much :)

PRs to remove this duplication are always welcome :)

nvtkaszpir commented 7 years ago

AFAIR sphinxdoc allows converting docs from Markdown to reStructuredText - I remember doing this. But then it would require changing the build process of the documentation. So the first step would be making a minimal new build script/configs for processing the .md files, which can later be easily converted to .rst. If I get some spare time, I'll try to make a fork+branch to show this.

prashanthpai commented 7 years ago

AFAIR sphinxdoc allows converting docs from Markdown to reStructuredText - I remember doing this. But then it would require changing the build process of the documentation.

We have visited this option earlier (using pandoc). In many cases it required manual intervention after the conversion, and it was too much effort.
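
For context, a batch conversion of the kind discussed here might look roughly like the sketch below. It assumes pandoc is installed and that pandoc's default `markdown` reader is close enough to the flavor used in these docs; rst_path and convert_tree are hypothetical helper names. The output would likely still need the manual review mentioned above.

```shell
#!/bin/sh
# Sketch: convert every Markdown file under a directory to reStructuredText.
# Helper names are hypothetical; only standard pandoc options are used.

# Map a Markdown path to the .rst path next to it.
rst_path() {
    printf '%s\n' "${1%.md}.rst"
}

# Usage: convert_tree DIRECTORY
# Writes FILE.rst next to each FILE.md; the original .md files are left as-is.
convert_tree() {
    find "$1" -type f -name '*.md' | while IFS= read -r md; do
        pandoc -f markdown -t rst -o "$(rst_path "$md")" "$md"
    done
}

# Example call, run from the repository root:
# convert_tree "Administrator Guide"
```

Keeping the .md files in place means the existing build keeps working while the converted .rst files are checked by hand.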

nvtkaszpir commented 7 years ago

Yeah, been there ;-) Either way the docs need reformatting, so it's inevitable.

The only thing I fear is that the docs are already included in some other projects/packages that rely heavily on the current setup.

But switching to reStructuredText would allow easy export to multiple formats like man pages, PDF and so on.

sankarshanmukhopadhyay commented 6 years ago

Alright - so what are the next steps here? Is reconciling the documents and presenting a single glossary the decision?