seqan / product_backlog

This repository is used as product backlog for all SeqAn relevant backlog items. This is intended to organise the work for the team.
[INFRA] list missing entities in doxygen #384

Open marehr opened 3 years ago

marehr commented 3 years ago

Description

In a couple of places, a typo or a rename has broken the links to our documented entities.

For example, seqan3::alignment_file_input::bitscore_type should link to field::bitscore (i.e. seqan3::field::bitscore), but the actual entity is called seqan3::field::bit_score.

It would be awesome if Doxygen could emit some kind of warning whenever it cannot link an entity.

I know that if we used explicit links, we would get a warning whenever something can't be linked.

(I'm not sure why the field::* references did not link to all entities in 3.0.2; in 3.0.3, for seqan3::sam_file_input::bitscore_type, it works?!)

marehr commented 3 years ago

I tried the following

diff --git a/test/documentation/seqan3_doxygen_cfg.in b/test/documentation/seqan3_doxygen_cfg.in
index d5d59d926..adbdc38ae 100644
--- a/test/documentation/seqan3_doxygen_cfg.in
+++ b/test/documentation/seqan3_doxygen_cfg.in
@@ -26,6 +26,7 @@ DOT_GRAPH_MAX_NODES    = 500
 INTERACTIVE_SVG        = ${SEQAN3_DOXYGEN_HAVE_DOT}

 ## MISC OPTIONS
+GENERATE_XML           = YES
 GENERATE_LATEX         = NO
 HTML_TIMESTAMP         = YES
 EXT_LINKS_IN_WINDOW    = YES

This generates XML files, which https://www.doxygen.nl/manual/customize.html recommends as the way to fully customise the output.

Those XML files are fully annotated. I then tried the following XSLT, which removes every XML node that links to an entity (ref) as well as all code listings (programlisting):

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- entry point: wrap all brief and detailed descriptions of the document in one root element -->
<xsl:template match="node()|@*">
<root>
    <xsl:for-each select="//briefdescription|//detaileddescription">
        <xsl:call-template name="description"/>
    </xsl:for-each>
</root>
</xsl:template>

<!-- copy the current node and recurse into its children, skipping ref (linked entities) and programlisting (code) -->
<xsl:template name="description">
    <xsl:copy>
        <xsl:for-each select="node()[not(self::ref) and not(self::programlisting)]">
            <xsl:call-template name="description"/>
        </xsl:for-each>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>

This produces via

xsltproc test.xsl classseqan3_1_1sam__file__input.xml > test.xml

a filtered output file.
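
For anyone who would rather not maintain a stylesheet, the same filtering idea can be sketched in Python with the standard library. This is only a rough equivalent under assumptions: the function names and the sample input are made up for illustration, and the sample merely imitates the shape of Doxygen's XML (a ref element for linked entities, plain text for everything else).

```python
import xml.etree.ElementTree as ET

# Elements whose content is dropped, mirroring the XSLT filter above.
SKIPPED_TAGS = {"ref", "programlisting"}


def unlinked_text(element):
    """Return the text of `element` without the content of skipped children."""
    parts = [element.text or ""]
    for child in element:
        if child.tag not in SKIPPED_TAGS:
            parts.append(unlinked_text(child))
        # Text following a child belongs to the parent, so it is always kept.
        parts.append(child.tail or "")
    return "".join(parts)


def unlinked_descriptions(xml_source):
    """Yield the filtered text of every brief and detailed description."""
    root = ET.fromstring(xml_source)
    for element in root.iter():
        if element.tag in ("briefdescription", "detaileddescription"):
            # Collapse runs of whitespace left behind by removed elements.
            text = " ".join(unlinked_text(element).split())
            if text:
                yield text


# Hypothetical input that imitates the shape of Doxygen's XML output.
SAMPLE = """<doxygen><compounddef>
<briefdescription><para>The type of <ref refid="x" kindref="member">seqan3::field::seq</ref> (default std::vector&lt;seqan3::dna5&gt;). </para></briefdescription>
<detaileddescription><para>The type of field::bitscore is fixed to double. </para></detaileddescription>
</compounddef></doxygen>"""

if __name__ == "__main__":
    for line in unlinked_descriptions(SAMPLE):
        print(line)
```

Unlike the XSLT, this sketch returns plain text instead of XML, which is enough if the result is only searched for leftover names afterwards.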

A simple grep for "::"

grep "::" test.xml

yields

    <para>The type of  (default std::vector&lt;seqan3::dna5&gt;). </para>
    <para>The type of  is fixed to std::optional&lt;int32_t&gt;. </para>
    <para>The type of  is fixed to an std::optional&lt;int32_t&gt;. </para>

    <para>The type of  (default std::vector&lt;seqan3::phred42&gt;). </para>
    <para>The type of  is fixed to std::vector&lt;cigar&gt;. </para>
    <para>The type of  is fixed to std::tuple&lt;ref_id_type, ref_offset_type, int32_t&gt;). </para>
    <para>The type of field::bitscore is fixed to double. </para>
    <para>The type of  (default: sam_file_header&lt;typename traits_type::ref_ids&gt;). </para>
    <para><parameterlist><parameteritem><parameternamelist><parametername>stream_t</parametername></parameternamelist><parameterdescription><para>The stream type; must model seqan3::input_stream. </para></parameterdescription></parameteritem><parameteritem><parameternamelist><parametername>file_format</parametername></parameternamelist><parameterdescription><para>The format of the file in the stream, must model . </para></parameterdescription></parameteritem></parameterlist><parameterlist><parameteritem><parameternamelist><parametername>stream</parametername></parameternamelist><parameterdescription><para>The stream to operate on; must be derived of . </para></parameterdescription></parameteritem><parameteritem><parameternamelist><parametername>format_tag</parametername></parameternamelist><parameterdescription><para>The file format tag. </para></parameterdescription></parameteritem><parameteritem><parameternamelist><parametername>fields_tag</parametername></parameternamelist><parameterdescription><para>A  tag. [optional]</para></parameterdescription></parameteritem></parameterlist>
    <para><parameterlist><parameteritem><parameternamelist><parametername>stream_t</parametername></parameternamelist><parameterdescription><para>The stream type; must model seqan3::input_stream. </para></parameterdescription></parameteritem><parameteritem><parameternamelist><parametername>file_format</parametername></parameternamelist><parameterdescription><para>The format of the file in the stream; must model . </para></parameterdescription></parameteritem></parameterlist><parameterlist><parameteritem><parameternamelist><parametername>stream</parametername></parameternamelist><parameterdescription><para>The stream to operate on; must be derived of . </para></parameterdescription></parameteritem><parameteritem><parameternamelist><parametername>ref_ids</parametername></parameternamelist><parameterdescription><para>A range containing the reference ids that correspond to the SAM/BAM file. </para></parameterdescription></parameteritem><parameteritem><parameternamelist><parametername>ref_sequences</parametername></parameternamelist><parameterdescription><para>A range containing the reference sequences that correspond to the SAM/BAM file. </para></parameterdescription></parameteritem><parameteritem><parameternamelist><parametername>format_tag</parametername></parameternamelist><parameterdescription><para>The file format tag. </para></parameterdescription></parameteritem><parameteritem><parameternamelist><parametername>fields_tag</parametername></parameternamelist><parameterdescription><para>A  tag. [optional]</para></parameterdescription></parameteritem></parameterlist>
    <para>In the above example, <computeroutput>rec</computeroutput> has the type  which is a specialisation of  and behaves like an  (that's why we can access it via <computeroutput>get</computeroutput>). Instead of using the  based interface on the record, you could also use <computeroutput>std::get&lt;0&gt;</computeroutput> or even <computeroutput>std::get&lt;dna4_vector&gt;</computeroutput> to retrieve the sequence, but it is not recommended, because it is more error-prone.</para>
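
Since most of these hits are std:: types, the grep output could be narrowed down further. The following is a minimal sketch of such a post-filter, assuming that anything in the std namespace is uninteresting; the function name, the ignore list and the regular expression are illustrative choices, not something Doxygen provides. What remains are only candidates for broken links, not confirmed ones.

```python
import re

# A qualified name such as `field::bitscore` or `seqan3::input_stream`.
QUALIFIED_NAME = re.compile(r"\b\w+(?:::\w+)+")


def unlinked_candidates(lines, ignored_namespaces=("std",)):
    """Return the sorted set of qualified names outside the ignored namespaces."""
    found = set()
    for line in lines:
        for name in QUALIFIED_NAME.findall(line):
            # Compare only the outermost namespace against the ignore list.
            if name.split("::", 1)[0] not in ignored_namespaces:
                found.add(name)
    return sorted(found)


# A few lines taken from the grep output above, with XML entities decoded.
SAMPLE_LINES = [
    "The type of  (default std::vector<seqan3::dna5>).",
    "The type of  is fixed to std::optional<int32_t>.",
    "The type of field::bitscore is fixed to double.",
    "The stream type; must model seqan3::input_stream.",
]

if __name__ == "__main__":
    for name in unlinked_candidates(SAMPLE_LINES):
        print(name)
```

On the sample lines this reports field::bitscore, seqan3::dna5 and seqan3::input_stream. Names nested in template arguments, such as seqan3::dna5, may simply not be auto-linked by Doxygen, so each hit still needs a manual look.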