cppreference: Parse reference pages with members

jchristgit commented 7 years ago

A couple of pages like std::vector or std::thread have multiple member types and member functions that are currently not being picked up by the parser. Simply adding a new key for these would be the easiest approach, since the type of a symbol could then be checked based on that key.

jchristgit commented 7 years ago

Alright, I thought a bit more about this. 🤔

What I think would be useful here is having two different types of C++ symbols, with the type being specified inside the JSON for each symbol.

{
    "type": 0,
    ...
    "link": "..."
}

The type could simply be an integer, since it's only used inside of docextract.py and the values are easy to remember. I propose the following:

0 -> A function with parameters and return values, such as std::abs.
1 -> A type with member functions and types, such as std::vector.

Further types could be added as needed, but these two should cover around 95% of the symbols found (as of writing this post, I haven't found any reference pages that aren't covered).
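The two integer values above could also be given names without changing what ends up in the JSON. This is only a sketch: the `SymbolType` class and the `"name"` key are hypothetical, and only the `"type"` key with the values 0 and 1 comes from the proposal.

```python
import json
from enum import IntEnum


class SymbolType(IntEnum):
    """Hypothetical names for the integer `type` values proposed above."""

    FUNCTION = 0  # function with parameters and return values, e.g. std::abs
    TYPE = 1      # type with member functions and types, e.g. std::vector


# Hypothetical symbol entry; only the "type" key comes from the proposal.
raw = '{"type": 1, "name": "std::vector"}'
symbol = json.loads(raw)

# Converting the stored integer raises ValueError for values that are not defined.
kind = SymbolType(symbol["type"])
```

Since `IntEnum` members compare equal to plain integers, existing checks like `symbol['type'] == 0` would keep working.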

Inside of docextract.py, a simple check could be made, and building an Embed for each type could be moved into its own function, something along the lines of this:

if symbol['type'] == 0:
    return create_symbol_func_embed(symbol)
elif symbol['type'] == 1:
    return create_symbol_type_embed(symbol)
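For reference, here is a slightly fuller sketch of the same dispatch that can be run on its own. It makes a few assumptions: plain dicts stand in for the real Embed objects, the `"name"` key and the `create_symbol_embed` wrapper are hypothetical, and a lookup table replaces the if/elif chain so that adding a third type is a one-line change.

```python
def create_symbol_func_embed(symbol):
    # Stand-in: the real function would build an Embed for a function symbol.
    return {"title": symbol["name"], "kind": "function"}


def create_symbol_type_embed(symbol):
    # Stand-in: the real function would build an Embed for a type symbol.
    return {"title": symbol["name"], "kind": "type"}


# Maps the integer "type" value of a symbol to the function that renders it.
EMBED_BUILDERS = {
    0: create_symbol_func_embed,
    1: create_symbol_type_embed,
}


def create_symbol_embed(symbol):
    """Pick the builder matching symbol['type'] and return its result."""
    try:
        builder = EMBED_BUILDERS[symbol["type"]]
    except KeyError:
        # Covers both a missing "type" key and an unknown integer.
        raise ValueError(f"unknown symbol type: {symbol.get('type')!r}") from None
    return builder(symbol)
```

Raising on unknown types is a design choice here; silently returning nothing would hide symbols that were tagged incorrectly in the JSON.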

jchristgit commented 7 years ago

I dug around in this a bit further today. 😃😃😃 The 'type' key is implemented and docextract.py generates different Embeds depending on the specified type. I also split up parsing a symbol type and a symbol function into different functions inside the parser.

The issue with parsing member types is that every reference page for a type contains multiple tables matched by the selector table.t-dsc-begin, although we only want a specific one. I suppose this could be solved with a simple helper function that extracts the header of each table along with its contents and then returns the table with the specified name.
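A rough sketch of such a helper, using only the standard library so it runs without extra dependencies. Everything here is an assumption rather than the existing parser code: the names `MemberTableParser` and `find_table` are made up, the sample HTML is a simplified stand-in for a real page, and it assumes each table is preceded by a heading element (h2 to h4) and contains no nested tables.

```python
from html.parser import HTMLParser

HEADING_TAGS = ("h2", "h3", "h4")


class MemberTableParser(HTMLParser):
    """Group the rows of each table.t-dsc-begin under the heading before it."""

    def __init__(self):
        super().__init__()
        self.tables = {}       # heading text -> list of rows (lists of cell text)
        self._heading = ""     # text of the most recent heading
        self._in_heading = False
        self._rows = None      # row list of the table being read, if any
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag in HEADING_TAGS:
            self._in_heading, self._heading = True, ""
        elif tag == "table" and "t-dsc-begin" in classes:
            self._rows = self.tables.setdefault(self._heading.strip(), [])
        elif self._rows is not None and tag == "tr":
            self._row = []
        elif self._row is not None and tag == "td":
            self._cell = []

    def handle_data(self, data):
        if self._in_heading:
            self._heading += data
        elif self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self._in_heading = False
        elif tag == "table":
            self._rows = None
        elif tag == "td" and self._cell is not None:
            # Collapse runs of whitespace inside the cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self._rows.append(self._row)
            self._row = self._cell = None


def find_table(html, name):
    """Return the rows of the table under heading `name`, or None if absent."""
    parser = MemberTableParser()
    parser.feed(html)
    parser.close()
    return parser.tables.get(name)


# Simplified stand-in for a reference page, not real cppreference markup.
SAMPLE = """
<h3><span class="mw-headline">Member types</span></h3>
<table class="t-dsc-begin">
  <tr><td>value_type</td><td>T</td></tr>
</table>
<h3><span class="mw-headline">Member functions</span></h3>
<table class="t-dsc-begin">
  <tr><td>push_back</td><td>adds an element to the end</td></tr>
</table>
"""

member_types = find_table(SAMPLE, "Member types")
```

If the parser already uses an HTML library with CSS selectors, the same idea (walk back from each table to its preceding heading) would likely be shorter there.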

Since @aeshthetic is assigned to this anyways - can you work on this? 👍

strinking / docflow

cppreference: Parse reference pages with members #14