Provide API documentation

zgyarmati commented 7 years ago

Partially depends on #167. Currently the API of the lib documented only in the src/lib/librnp.3 file and don't have any autogenerating infrastructure in place. Instead of this we need a source-code comment based, autogenerated API documentation. For this we can use doxygen or with a slightly more modern approac cldoc. The disadvantage of the latter one is that it's not available in most the distros (yet), but we can still integrate it into our build process. So which one should be used? After decided which one we want to use, i'm happy to take the coordination of setting up the auto-generation, although obviously wrinting the API comments themselves should be a team effort.

dewyatt commented 7 years ago

A lot of the internal API is documented with doxygen (at least the parts inherited from OpenPGP-SDK). But I know some of the doxygen comments are misleading and out of date.

cldoc certainly looks promising but I do have some concerns there: https://github.com/jessevdk/cldoc/issues/11 If that's still a valid issue, I'm not sure the output would be grouped in a meaningful way. We should look into it maybe.

zgyarmati commented 7 years ago

Indeed, i shouldn't have imply that there are no existing doxygen document comments, on the other hand i more prefer here to the public API, mainly because of the #173 . I didn't spot this bug in cldoc (i've used only for C++), and anyway, i don't have a strong position about using it, so i'm totally fine if we work in doxygen.

ronaldtse commented 7 years ago

It seems that jessevdk/cldoc#11 is not (or not going to be) fixed. Maybe we should stick with doxygen for now and could migrate to another solution afterwards?

dewyatt commented 7 years ago

@flowher mentioned:

Except for static functions, I think comments should go to header files as those files are delivered to the end user (especially for API functions).

There are pros and cons to both sides of this so we should discuss and standardize

Documentation in Header Files

Pros

H0 Users that browse the public header files can read the doxygen comments.

Cons

H1. An excess of documentation in the headers can make it more difficult for users to find the actual functions/types/etc.
H2 The point of using doxygen is to generate HTML/etc documentation, so users don't need to read header files to figure out what's going on.
H3 Changing a comment in a header can trigger a rebuild of the entire source tree, depending on the code base.

Documentation in Source Files

Pros

S0 When a developer updates a function, I think they are much more likely to update the corresponding doxygen comment if it is right above the code.
S1 Similar to the first one, when a developer is updating a function implementation, they can see the contract that the function is supposed to adhere to.

Cons

S2 An excess of documentation in the source files could potentially make it more difficult for a developer to locate the code they are looking for.

You can also do both of these, in various ways. Anyways, I slightly lean towards preferring doxygen comments in source files personally.

Some relevant links:

SO - Where to put the doxygen comment blocks for an internal library - in H or in CPP files?

ni4 commented 7 years ago

I am voting for comments in the source files as well - headers should be short and easily readable. User anyway will have access to the sources or doxygen-generated documentation. And we need a template for the function comment, so it looks consistent everywhere.

I can write it and we will discuss it in the PR comments, what do you think?

dewyatt commented 7 years ago

Maybe we can have a vote on this comment (how to vote).

:thumbsup: = doxygen comments in source files :thumbsdown: = doxygen comments in header files comment = something else

dewyatt commented 7 years ago

Another thing perhaps worth a vote:

:thumbsup: doxygen markdown-style

/**
 This is *really* important

 @param src the source
 @param dest the destination
 */

:thumbsdown: standard doxygen

/**
 This is \em really important

 \param src the source
 \param dest the destination
 */

There's a lot of middle ground here and we're actually voting on two separate things but it's not terribly important as long as we're consistent.

Right now we have a mixture.

I find the first to be a bit easier to read myself, but more importantly I think every developer is likely to be familiar with markdown today.

dewyatt commented 7 years ago

@riboseinc/rnp Two quick votes on some documentation-related items, input appreciated.

I would just rather not force anything on this without some input first. Ronald can make an executive call if he wants to speed things along, but otherwise we ought to maybe give this some time for input.

kriskwiatkowski commented 7 years ago

This is what I'm using

/* -----------------------------------------------------------------------------
 * @brief   brief      
 *          
 * @param   param   
 *          
 * @remarks (optional) remarks 
 * 
 * @pre (optional) assumptions about parameters (like must be not null)
 *
 * @returns (if returns)
 *
-------------------------------------------------------------------------------- */

kriskwiatkowski commented 7 years ago

So, there are two types of comments - those which explain implementation and those which explain interface. Here we are talking about comments for function interface - as those comments which explain implementation are closest possible to the block that needs explanation.

My arguments for comments in header files are following:

Comment above the function should explain interface of the function - parameters, return values and what the function does. It explains declaration of the function, so should be close to the declaration (which are in the headers). Comments should follow changes in the interface.
We will have comments in the header files anyway. That's because structures, macros, #defines etc. are placed only in the headers. It's cleaner to keep all the comments in the same place.
Sometimes you may have same interface implemented in many different ways (f.e. you have implemented two key storage solutions in rnp). What you want, is to have comment about interface to be placed in one single place and not copy it in multiple .c files (once again, comments about interface not implementation).
Depends probably on experience, but personally I find more often comments in header files. boringssl, android, linux, mbedtls (I don't like this argument, we should do what's smart, not what's common).

Additionally:

Comments in the headers should say minimum needed about implementation - because users should develop to the interface which doesn't change, not to the implementation (which may change very often and that's good thing).
Comments about implementation should be rare as code ideally should explain itself and big comment usually means that someone tries to explain overcomplicated code.
It is possible to generate doxygen from comments in headers and from sources. It doesn't make any difference where those comments are placed. So having possibility to find them in either doxygen or headers is rather convenience than disadvantage (with very little value).
cons H2. So users will look at doxygen if they prefer to do it. I see it as false argument
cons H3. Is it really important?
pro S0-S1: those arguments are exactly the same. Hope I've explained difference I see between implementation and interface.
cons S2 and H1: Usually I have a problem with lack of documentation not excees. I would love to have this problem! I don't think it would make much difference to me if there is too much good documentation.

You can also do both of these, in various ways. Anyways, I slightly lean towards preferring doxygen comments in source files personally.

We definitely don't want to have copy of same comment in both places.

But then, as I've wrote in the PR, personally I would prefer it in headers, as it makes more sense to me and I find it quite useful. But if it's in sources - cool, then user just needs to find out where is a doxygen (I believe it is currently possible to generate doxygen somehow, right?). Anyways, documentation is like wine - when it's good it's very, very good; but when it's bad it's better then nothing :) So let's have very, very good one (and this is why I've used "wine" instead of...)

kriskwiatkowski commented 7 years ago

arghh... I've forgotten to add botan

dewyatt commented 7 years ago

Comment above the function should explain interface of the function - parameters, return values and what the function does.

Unfortunately, then you end up with parameter names in 3 different places - the function declaration, the doxygen comment, and the function definition/impl. This is probably why netpgp never put parameter names in headers, and used doxygen comments in the source files only.

It explains declaration of the function, so should be close to the declaration (which are in the headers). Comments should follow changes in the interface.

It would explain more than the declaration. For example, it may say xyz should be initialized prior to using this function. This is not part of the declaration and is part of the contract of the function.

We will have comments in the header files anyway.

Sure, we would have some small comments in the headers, but not lengthy doxygen docs. With the doxygen comments in the headers we'll end up with something like a 20:1 comment:code ratio in the headers, which makes it harder for a developer to locate an actual declaration (or lack thereof).

Sometimes you may have same interface implemented in many different ways (f.e. you have implemented two key storage solutions in rnp). What you want, is to have comment about interface to be placed in one single place and not copy it in multiple .c files (once again, comments about interface not implementation).

Doxygen is surprisingly flexible so I can't see this being an issue.

cons H2. So users will look at doxygen if they prefer to do it. I see it as false argument

Users are likely (soon) going to do something like yum -y install librnp && xdg-open /usr/share/doc/librnp/html/index.html. Or they'll view one of the other various documentation formats generated by doxygen, like man librnp.

cons H3. Is it really important?

It's not a huge deal at the moment, but the more the code base grows, the longer it will probably take to compile. But perhaps more importantly, developers will use any reason to avoid updating the doxygen comments in a file other than the file they just edited.

pro S0-S1: those arguments are exactly the same. Hope I've explained difference I see between implementation and interface.

These are very different. S0 is referring to, for example, function signatures. S1 is referring to the contract that a function has. An example:

/**
 Formats a thing as a string.

 @param[out] dest The destination buffer. This will *always* be NULL-terminated.
 @param[in] size The size of the destination buffer. This should always be >= 1.
 @param[in] thing The thing.

 @return the formatted string
*/
char *format_thing(char *dest, const int size, int thing) {
    char *ptr;    
    ptr = &dest[size - 1];    
    *ptr = '\0';

    // do other stuff
    return ptr;
}

If a developer comes in and makes a change to this function:

char *format_thing(char *dest, const int size, int thing) {
    char *ptr;    
    ptr = &dest[size - 1];    
    if (!is_thing_valid(thing))
        return NULL;

    *ptr = '\0';

    // do other stuff
    return ptr;
}

Now the function is violating at least two parts of its contract. The developer can easily overlook this if the contract is located in an entirely different file.

We definitely don't want to have copy of same comment in both places.

Agreed and I wasn't suggesting this, I was referring to the idea of @brief comments in headers and details like @param in source files, or using @copydoc, or the various other things that doxygen supports.

kriskwiatkowski commented 7 years ago

Unfortunately, then you end up with parameter names in 3 different places

no diff if you have it in .h or .c

This is not part of the declaration and is part of the contract of the function.

This describes how interface works. Should go to .h

Sure, we would have some small comments in the headers, but not lengthy doxygen docs

We should have whatever constitutes good description.

"@return the formatted string"

Nop, that's wrong. It returns pointer. Pointer can be NULL.

dewyatt commented 7 years ago

nop, 2. doxygen is autogenerated. no diff if you have it in .h or .c

When you put doxygen comments in headers:

thing.h

/**
 Does a thing.

 @param thing The thing <---- one
*/
void do_thing(int thing); //<---- two

thing.c


void do_thing(int thing) { //<---- three
  //...
}

Nop, that's wrong. It returns pointer. Pointer can be NULL.

It returns a string, as specified by the contract. NULL is not a string. Furthermore, the caller would be violating the contract if dest was NULL because size has to be >= 1 and size is the size of the dest buffer. In the second version the function violates the contract by returning a non-string and by not NULL-terminating the buffer in all cases.

kriskwiatkowski commented 7 years ago

Ok, 3. Still - no diff where you put the comment. Except if you don't have function in .h. Then you have static function, right? But we agreed then you need comments in .c. Hope that's explains.

It returns a string, as specified by the contract. NULL is not a string.

Let me put it this way - you changed behaviour of API. It's still binary compatible, but doesn't behave same way, it's not backward compatible - now it returns NULL. You need to update your comment. No difference if it is in .c or .h. Is it easier to overlook? yes it is, I agree. Still this argument doesn't convince me.

ronaldtse commented 7 years ago

I think both approaches have merits, but my main considerations are:

echoing @dewyatt developers are lazy to change other files outside of the .c where changes are made.
@flowher is right that the public interface should be stable, and any change to the interface should be reflected in the .h too. This also means if the interface changes, the developer should also update the .h and therefore the comments there.

Personally I believe doxygen comments should be in .h files because they are user facing -- the user does not care about the implementation, but only the interface (and parameters) (i.e. the contract). But the argument is that the doxygen generated documentation is the contract between developers and users. So at this age of day it really doesn't matter where it is put given we have generated doxygen documentation 😄

(I voted for .h)

dewyatt commented 7 years ago

@ni4 Can you come up with the template for this and do a PR on the dev guide?

It looks like the votes are for doxygen comments in headers and markdown styling.

EDIT: typo

ni4 commented 7 years ago

@dewyatt Ok, sure. Will add it.

catap commented 7 years ago

anyway, I think to request documentation for each static function it is too many, isn't it?

dewyatt commented 7 years ago

@catap What we currently have in the PR makes it optional. My thoughts are that it shouldn't be a requirement to document static functions, but it should be OK for any developer to add that documentation if he/she thinks it is of benefit.

kriskwiatkowski commented 7 years ago

+1

catap commented 7 years ago

@dewyatt I see new changes and it is ok.

Anyway when I wrote a comment, a PR has only 37958faf5fe0cf1d2c367446d91cad723f539ae8 where was:

Documentation is done in Doxygen comments format, which must be put in header files. Exception are static or not exposed to the public API functions - they should be documented in the code.

rnpgp / rnp