ocaml / odoc

Documentation compiler for OCaml and Reason
Other
321 stars 88 forks source link

@hidden would be a useful tag to have on items #578

Open jonludlam opened 3 years ago

jonludlam commented 3 years ago

Rather than putting the item between (**/**) comments - it's easier for ppxes.

desirekaleba commented 2 years ago

Hi @jonludlam, I am an Outreachy intern looking at being a potential contributor to this project. I have successfully cloned the repository and got it up and running using this link.

I was trying to take on some issues on v3.ocaml.org-server repo and managed to raise a PR - (170) but I just noticed that the repository is not listed on the Outreachy website.

Can you please throw some light on this issue for me to start working on it?

jonludlam commented 2 years ago

Hi @desirekaleba - OK, this'll be a challenging item I think!

First of all I suggest you look here to see how we can currently hide items with the use of 'stop comments'. The idea of this issue is that we want an additional way of hiding items from the docs by using an OCaml attribute. Attributes are explained in the manual here but it might be useful to give an example here:

type t = int [@@hidden]

here we've declarated a type t, and we have attached the attribute 'hidden'. We would like that to be equivalent to the following code:

(**/**)
type t = int
(**/**)

The first place to look for this is the place where we currently decide whether items are hidden or not, and that is in the file ident_env.cppo.ml in the two functions extract_signature_tree_items and extract_structure_tree_items. Start by having a look at those two and see if you can understand how we currently handle the stop comments. Hint: Documentation comments in OCaml start with (** and end with *). Given that fact, and that stop comments are parsed as documentation comments, what would the content of a stop comment be?

Once you've figured that out, the next step is to add in a check to see if the item has an attribute [@@hidden] attached to it. You can see some code to look at attributes in doc_attr.ml. For the case of the type example above, the code to look at the ident of the type (t in the example above), is here - decl is the record representing the type declaration, and typ_id is the field that contains the ident. The attributes of the type declaration are in the field typ_attributes - which can be scanned for the hidden attribute, and if it's there we can mark the type as hidden. We'll then want to do the same with all of the other kinds of items - modules, module types, values, etc.

I've tried to give enough here to be able to completely address the issue, but I don't really expect you to be able to do it without asking further questions. So give it a go, see how far you get, and let us know when you get stuck!

jonludlam commented 2 years ago

I should mention that there are a bunch of #if ... #else ... #endif statements in there - this isn't OCaml code but preprocessor directives just like C. OCaml doesn't natively support these, but there's a preprocessor called cppo that we run the source through before giving it to the compiler. We use these to ensure this one source file can compile with all the different versions of the OCaml compiler - the problem being that we're using the compiler's ASTs, and it occasionally changes from version to version.

desirekaleba commented 2 years ago

Thank you for all these details, @jonludlam. I am currently giving it a shot and I will be updating you.

desirekaleba commented 2 years ago

Hi @jonludlam, I took some time to go through the cppo and the ident_env.cppo.ml file. This has given me an understanding of what we really want to achieve and how can that be achieved. Below are a few points I can talk about so far:

If I am wrong so far, I would like to have some light on what I don't get yet.

But I have just noticed that I need to brush up on my Ocaml skills in order to proceed with this issue. I will keep updating you on what progress I have made on this.

jonludlam commented 2 years ago

Right so far! Can you see how and why the value of hide_items changes?

desirekaleba commented 2 years ago

@jonludlam, I have been going through most of the topics including but not limited to if_statements_loops_and_recursion, lists, data_types_and_matching, etc. to help me understand how and why the value of hide_item is changing but there is a code snippet I don't really understand.

List.map (fun decl -> `Type (decl.typ_id, hide_item))
      decls @ extract_signature_tree_items hide_item rest

What I don't really get here is

`Type (decl.typ_id, hide_item)

Is it a way of creating our type item(eg: modules, module types, values, etc.)?

desirekaleba commented 2 years ago

I also found this part which I think is checking for the presence of /* and decides to hide it or not.

Some ("/*", _) -> extract_signature_tree_items (not hide_item) rest 
jonludlam commented 2 years ago

OK, let me explain a little further. This is more detail than you really need, but might help you to understand what's going on here.

Each element in a signature has a unique identifier assigned by the compiler - essentially a pair of the name of the element and a number, which is required because the name is not unique enough by itself. You can actually see this: Given a file 'test.mli' containing the following:

type t
val f : t

You can run the compiler on this and ask it to show you how understands this code by running ocamlc -dtypedtree test.mli, which gives the following output:

[
  signature_item (test.mli[1,0+0]..test.mli[1,0+6])
    Tsig_type Rec
    [
      type_declaration t/81 (test.mli[1,0+0]..test.mli[1,0+6])
        ptype_params =
          []
        ptype_cstrs =
          []
        ptype_kind =
          Ttype_abstract
        ptype_private = Public
        ptype_manifest =
          None
    ]
  signature_item (test.mli[2,7+0]..test.mli[2,7+9])
    Tsig_value
    value_description f/82 (test.mli[2,7+0]..test.mli[2,7+9])
      core_type (test.mli[2,7+8]..test.mli[2,7+9])
        Ttyp_constr "t/81"
        []
      []
]

We have here 2 'signature_items' - a Tsig_type and a Tsig_value, corresponding to the type t and the val f : t in our code. The type declaration has been given ident t/81, and the value has been given the ident f/82. You can also see that the value description has a reference in it to Ttyp_constr "t/81" - so we know that it's the type t defined just above, and not some other type t anywhere else.

These idents are not quite detailed enough for odoc - odoc wants to be able to link from uses of type to the definition, so instead of having t/81, odoc's identifier looks a bit more like Test.t - that is, it's the t declared in module Test. So every time we see t/81 we need to know that it actually means Test.t - we need a map from OCaml's idents to odoc's identifiers.

The two functions you're currently looking at are part of the process of creating this map. They go over the list of items declared in a signature or structure and construct a list of the idents of the items declared in it. So given our signature above in test.mli, the output of this function would be:

[ `Type("t/81"); `Value("f/82") ]

In fact, the reason you're looking at this function is it's actually the most convenient place to decide whether an identifier is hidden or not (currently via the stop comments, (**/**)), so we actually return a bit more info than just the OCaml ident, we return a boolean representing whether the item should be hidden or not. So the output is actually:

[ `Type("t/81",false); `Value("f/82",false) ]

Now, with this in mind, let's take a look at the code snippet you posted:

  | { sig_desc = Tsig_type (_, decls); _} :: rest ->
    List.map (fun decl -> `Type (decl.typ_id, hide_item))
      decls @ extract_signature_tree_items hide_item rest

Now, OCaml allows mutually recursive type declarations, so something like:

type t = Foo of u
and u = Bar of t

and the way this is represented in the compiler is a single Tsig_type with a list of type declarations in it. In fact, even simple type declarations are represented like this, they just have a single declaration in the list! You can see this in the output above, where the Tsig_type has square brackets indicating a list (with only one entry in our case):

signature_item (test.mli[1,0+0]..test.mli[1,0+6])
    Tsig_type Rec
    [
      type_declaration t/81 (test.mli[1,0+0]..test.mli[1,0+6])
      ...
    ]

whereas the value doesn't:

 signature_item (test.mli[2,7+0]..test.mli[2,7+9])
    Tsig_value
    value_description f/82 (test.mli[2,7+0]..test.mli[2,7+9])
      ...

So in fact each Tsig_type will contain a list of type declarations, decls, and we have to deal with that. Let's break apart the expression in the snippet:

List.map (fun decl -> `Type (decl.typ_id, hide_item)) decls

This part goes over each declaration in the list of decls and constructs the value `Type (decl.typ_id, hide_item) for each one. So in our case, since decls will contain only one type declaration, our t/81, the value of this will be [ `Type ("t/81", false) ]. The rest of the expression is this:

... @ extract_signature_tree_items hide_item rest

The @ symbol is the list concatenate operator, and the right-hand side of the @ is the recursive call to ourself to process the rest of the signature. In our case, the rest of the signature is just the value declaration, so we'll get to this part of the function:

| { sig_desc = Tsig_value {val_id; _}; _ } :: rest->
      [`Value (val_id, hide_item)] @ extract_signature_tree_items hide_item rest 

Values aren't represented as list so we just have a single Tsig_value, so there's no List.map, and we just construct the value [ `Value ("f/82",false) ], and concatentate this with the results of the recursive call on the rest of the signature. For our example, we've reached the end of the signature, so the result of extract_signature_tree_items hide_item rest will just be the empty list, []. This is then appended to [ `Value("f/82",false) ] (which does nothing!), and then we return to concatenate this with [ `Type ("t/81", false) ], giving the final value:

[ `Type("t/81",false); `Value("f/82",false) ]

This list will then be used to construct the identifiers for our type and value elsewhere in the code, but we don't need to worry about that.

What you'll be needing to do is to see if the attribute [@@hidden] has been attached to any of the items in the signature, and if so, make sure that the boolean is set to true. In fact, you can use -dtypedtree to see the attributes too - so if we modify our test.mli a bit:

type t [@@hidden]
val f : t [@@hidden]

then ocamlc -dtypedtree test.mli shows the following:

[
  signature_item (test.mli[1,0+0]..test.mli[1,0+17])
    Tsig_type Rec
    [
      type_declaration t/81 (test.mli[1,0+0]..test.mli[1,0+17])
        attribute "hidden"
          []
        ptype_params =
          []
        ptype_cstrs =
          []
        ptype_kind =
          Ttype_abstract
        ptype_private = Public
        ptype_manifest =
          None
    ]
  signature_item (test.mli[2,18+0]..test.mli[2,18+20])
    Tsig_value
    value_description f/82 (test.mli[2,18+0]..test.mli[2,18+20])
      attribute "hidden"
        []
      core_type (test.mli[2,18+8]..test.mli[2,18+9])
        Ttyp_constr "t/81"
        []
      []
]

so both of our signature_items now have an attribute hidden. The question is: How do you detect that in our function?

desirekaleba commented 2 years ago

@jonludlam, thank you for the above details. This has given me a clear understanding of what's happening.

Answering the question, the way to detect the hidden attribute in our function is to check whether decl.typ_attributes contains hidden. If it's present we are going to construct something like this

`Type (decl.typ_id, not hide_item)

For example, If our test.mli has the following content

type t [@@hidden]

This should give us

[ `Type("t/81",true) ]

Note: This is applicable only when our signature item is represented as a list.

If the signature item is not represented as a list, We will have to destructure the attributes key and scan it for any hidden value. Something like

{ sig_desc = Tsig_value {val_id; val_attributes; _}; _ }
Julow commented 2 years ago

I think you are on the right track!

As an inspiration, documentation comments are stored as attributes too in the Tast, meaning there's many places where attributes are inspected like this:

  let doc = Doc_attr.attached_no_tag label_container cd.cd_attributes in

This should give us

That's right :) but the hidden value shouldn't be obtained from not hide_item, but simply be true when there's the @hidden attribute.

desirekaleba commented 2 years ago

@Julow, Thank you for this. I think I will soon clear this out.

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. The main purpose of this is to keep the issue tracker focused to what is actively being worked on, so that the amount and variety of open yet inactive issues does not overwhelm contributors.

An issue closed as stale is not rejected — further discussion is welcome in its closed state, and it can be resurrected at any time. odoc maintainers regularly check issues that were closed as stale in the past, to see if the time is right to reopen and work on them again. PRs addressing issues closed as stale are as welcome as PRs for open issues. They will be given the same review attention, and any other help.

ccatherinee commented 1 year ago

Hi @jonludlam! I know this issue has gone stale, but I found this thread/issue super interesting so I took a look at it. Specifically, I think I've gotten the desired behaviour when it comes to the Tsig_val case (code below):

| { sig_desc = Tsig_value {val_id; val_attributes; _}; _ } :: rest->
    let f (init : bool) (val_attr: Parsetree.attribute) = 
      if init then init 
      else if val_attr.attr_name.txt = "hidden" then true 
      else false in 
    let hidden = List.fold_left f false val_attributes in 
    if hidden then [`Value (val_id, true)] @ extract_signature_tree_items hide_item rest else [`Value (val_id, hide_item)] @ extract_signature_tree_items hide_item rest 

This code occurs in the function extract_signature_tree_items in the file odoc/src/loader/ident_env.cppo.ml. To summarise, now when the user writes val f : t [@@hidden],val f: tis hidden in the documentation (I've tested this out by writing my own test file using Dune cram style). I was wondering if the code snippet above is similar to what you had in mind. I'm planning on doing something similar for each case (Tsig_type, Tsig_module, etc.) and factoring similar parts out in the end, if possible. Does this sound reasonable?

Furthermore, I'm also about to start working on the Tsig_modtype case and was wondering if you could shed some light on the desired behaviour: In this example:

module type S = sig [@@hidden]
  type t 
  val f : t
end

I suppose the entire module should be hidden, correct? However in this example:

module type S = sig
  type t [@@hidden]
  val f : t
end

only the type t part should be hidden, but not the entire module? And the following example

module type S = sig
  type t 
  val f : t
end [@@hidden] 

should instead hide what comes after the module? Thank you so much!

Julow commented 1 year ago

@ccatherinee In your code, you can use List.exists instead of List.fold_left with an if init inside.

Your first example is a syntax error, so there's nothing to worry about. Your third example attaches the [@@hidden] attribute to the module type S (it is attached to the Tsig_modtype) and has nothing to do with what comes after.

OCaml has a syntax for "standalone" attributes: [@@@hidden]. These are Tsig_attribute in the typed tree and would mean "hide everything that comes after the attribute".

ccatherinee commented 1 year ago

Hi @Julow - thank you so much for the detailed response. Just to clarify, does this mean my understanding of

module type S = sig
  type t [@@hidden]
  val f : t
end

is correct (ie only the type t part should be hidden, but not the entire module) or do these sentences

OCaml has a syntax for "standalone" attributes: [@@@hidden]. These are Tsig_attribute in the typed tree and would mean "hide everything that comes after the attribute".

mean that

module type S = sig
  type t [@@hidden]
  val f : t
end

is a syntax error and we can only have

module type S = sig
  type t [@@@hidden]
  val f : t
end

Furthermore, does

hide everything that comes after the [standalone] attribute

mean all remaining code in the file or just code in that module which comes after the standalone attribute? Thank you so much for your time!

lpw25 commented 1 year ago

I'm slightly surprised that we're going for:

type t [@@hidden]

rather than:

(** @hidden *)
type t

which would be more consistent with things like:

(** @inline *)
include Foo

Sorry to ask about this late in the day. Although, if we did decide to do this instead, I suspect adjusting the implementation to do so wouldn't need to change much.

jonludlam commented 1 year ago

I'm flummoxed. I can't see why I turned this from a tag (originally here: https://github.com/ocaml/odoc/pull/571#discussion_r566101709, and in the title of this issue!) into an attribute. The only annoying thing about an not using an attribute is that we'll have to change odoc-parser to do this... :-/

Julow commented 1 year ago

@jonludlam Which is implemented by @3Rafal and waiting for your approval and merge: https://github.com/ocaml-doc/odoc-parser/pull/16

jonludlam commented 1 year ago

Turns out using an attribute is rather a lot more annoying than an attribute as we haven't parsed any comments at the point we're hiding items.