opencobra / memote

memote – the genome-scale metabolic model test suite
https://memote.readthedocs.io/
Apache License 2.0

Errors in test cases #649

Open Midnighter opened 5 years ago

Midnighter commented 5 years ago

The following list of tests sometimes contains None or nan values in the "data" key. This points to exceptions occurring in code that we are not handling properly (or to really messed-up models). We need to find out which it is.

 [1] test_ngam_presence                                  
 [2] test_protein_complex_presence                       
 [3] test_number_independent_conservation_relations      
 [4] test_find_candidate_irreversible_reactions          
 [5] test_blocked_reactions                              
 [6] test_find_reactions_unbounded_flux_default_condition
 [7] test_find_constrained_pure_metabolic_reactions      
 [8] test_find_constrained_transport_reactions           
 [9] test_find_pure_metabolic_reactions                  
[10] test_find_transport_reactions                       
[11] test_metabolic_reaction_specific_sbo_presence       
[12] test_transport_reaction_gpr_presence                
[13] test_transport_reaction_specific_sbo_presence       
[14] test_absolute_extreme_coefficient_ratio             
[15] test_degrees_of_freedom                             
[16] test_matrix_rank
matthiaskoenig commented 5 years ago

Is there any way to distinguish tests that errored because of some bug or problem from tests that failed because the model quality requirements could not be fulfilled? It would be nice to get the subset of errored tests in order to investigate further what is going on.

Midnighter commented 5 years ago

At what level? The pytest command line output tells you whether a test failed or errored. The snapshot report also does so by combining the test result (failed) with the data attribute: if the test merely failed, data still holds a valid value; if it errored, data is invalid.
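For illustration, a minimal sketch of that heuristic (`classify_result` and the shape of the result dictionary are assumptions here, not memote's actual API):

```python
import math


def classify_result(result):
    """Distinguish a failed test from an errored one (hypothetical helper).

    A test that merely failed still carries a valid value under the
    "data" key, whereas an errored test leaves it as None or NaN.
    """
    data = result.get("data")
    if data is None:
        return "errored"
    if isinstance(data, float) and math.isnan(data):
        return "errored"
    return "failed"
```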

matthiaskoenig commented 5 years ago

Basically, everything that raises an AssertionError should be a Failure, whereas all other exceptions should be an Error (which means something went wrong in the test itself).

Unfortunately, this does not seem to work. Any exception raised during the test itself is treated as a normal failure in pytest; only exceptions raised during setup (e.g. in fixtures) are reported as errors. https://stackoverflow.com/questions/21513718/pytest-error-vs-fail
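A minimal reproduction of that behaviour (file and test names made up):

```python
import pytest


@pytest.fixture
def broken_fixture():
    # Raised during setup -> pytest reports the test as an error (E).
    raise RuntimeError("boom")


def test_error_in_setup(broken_fixture):
    assert True


def test_exception_in_body():
    # Raised inside the test body -> pytest reports a plain failure (F).
    raise ZeroDivisionError("boom")
```

Running pytest on this file yields E for the first test and F for the second, even though neither comes from a broken assertion.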

E.g., with the attached model tiny_example_12.zip I get 2 failures for the following tests, i.e. FF, instead of EF. The first one is clearly an error, because the test divides by the number of genes without first asserting that the model has any. Still, it is reported as a failure.

____________________ test_gene_product_annotation_presence _____________________

model = <Model tiny_example_12 at 0x7f7a389415c0>

    @annotate(title="Presence of Gene Annotation", format_type="count")
    def test_gene_product_annotation_presence(model):
        """
        Expect all genes to have a non-empty annotation attribute.

        This test checks if any annotations at all are present in the SBML
        annotations field (extended by FBC package) for each gene product,
        irrespective of the type of annotation i.e. specific database,
        cross-references, ontology terms, additional information. For this test to
        pass the model is expected to have genes and each of them should have some
        form of annotation.

        Implementation:
        Check if the annotation attribute of each cobra.Gene object of the
        model is unset or empty.

        """
        ann = test_gene_product_annotation_presence.annotation
        ann["data"] = get_ids(annotation.find_components_without_annotation(
            model, "genes"))
>       ann["metric"] = len(ann["data"]) / len(model.genes)
E       ZeroDivisionError: division by zero

/home/mkoenig/git/memote/memote/suite/tests/test_annotation.py:102: ZeroDivisionError
____________ test_metabolite_annotation_overview[pubchem.compound] _____________

model = <Model tiny_example_12 at 0x7f7a389415c0>, db = 'pubchem.compound'

    @pytest.mark.parametrize("db", list(annotation.METABOLITE_ANNOTATIONS))
    @annotate(title="Metabolite Annotations Per Database",
              format_type="percent", message=dict(), data=dict(), metric=dict())
    def test_metabolite_annotation_overview(model, db):
        """
        Expect all metabolites to have annotations from common databases.

        Specific database cross-references are paramount to mapping information.
        To provide references to as many databases as possible helps to make the
        metabolic model more accessible to other researchers. This does not only
        facilitate the use of a model in a broad array of computational pipelines,
        it also promotes the metabolic model itself to become an organism-specific
        knowledge base.

        For this test to pass, each metabolite annotation should contain
cross-references to a number of databases. The current selection is
        listed in `annotation.py`, but an ongoing discussion can be found at
        https://github.com/opencobra/memote/issues/332. For each database this
test checks for the presence of its corresponding namespace ID to comply
with the MIRIAM guidelines, i.e. they have to match those defined on
        https://identifiers.org/.

        Since each database is quite different and some potentially incomplete, it
        may not be feasible to achieve 100% coverage for each of them. Generally
        it should be possible, however, to obtain cross-references to at least
        one of the databases for all metabolites consistently.

        Implementation:
        Check if the keys of the annotation attribute of each cobra.Metabolite of
        the model match with a selection of common biochemical databases. The
annotation attribute of cobrapy components is a dictionary of
        key:value pairs.

        """
        ann = test_metabolite_annotation_overview.annotation
        ann["data"][db] = get_ids(
            annotation.generate_component_annotation_overview(
                model.metabolites, db))
        ann["metric"][db] = len(ann["data"][db]) / len(model.metabolites)
        ann["message"][db] = wrapper.fill(
            """The following {} metabolites ({:.2%}) lack annotation for {}:
            {}""".format(len(ann["data"][db]), ann["metric"][db], db,
                         truncate(ann["data"][db])))
>       assert len(ann["data"][db]) == 0, ann["message"][db]
E       AssertionError: The following 7 metabolites (100.00%) lack annotation for
E         pubchem.compound:         glc, g6p, atp, adp, phos, ...
E       assert 7 == 0
E         -7
E         +0

/home/mkoenig/git/memote/memote/suite/tests/test_annotation.py:153: AssertionError
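For what it's worth, a guard of the following shape would avoid the division by zero in the first test above and turn the error into a regular failure. This reuses the names from the traceback and is only a sketch, not the actual fix:

```python
# Sketch: guard test_gene_product_annotation_presence against models
# without genes (names as in the traceback above).
ann = test_gene_product_annotation_presence.annotation
assert len(model.genes) > 0, "The model has no genes."
ann["data"] = get_ids(annotation.find_components_without_annotation(
    model, "genes"))
ann["metric"] = len(ann["data"]) / len(model.genes)
assert len(ann["data"]) == 0
```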
Midnighter commented 5 years ago

:thinking: You're absolutely right. I'm not immediately sure what the best solution is.

ChristianLieven commented 5 years ago

Yeah, I must admit I was entirely oblivious to the fact that pytest does not, in fact, have a dedicated 'errored' category. I had simply come to expect that behaviour from it.

Could the annotation decorator be made to discern errored cases by catching any exception that is not an AssertionError? That would be nice because then we could propagate it through to the front end, too!

Midnighter commented 5 years ago

Not in the decorator, but when we retrieve the test results afterwards we can inspect the exception and write that out, yes.
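For instance, a sketch of what that inspection could look like using pytest's reporting hook in a conftest.py (the "errored" tag name is an assumption; memote may solve this differently):

```python
# conftest.py -- sketch only.
import pytest


@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_makereport(item, call):
    outcome = yield
    report = outcome.get_result()
    if report.when == "call" and report.failed and call.excinfo is not None:
        # Tag non-assertion exceptions so the reporting layer can show
        # them as errors rather than plain model-quality failures.
        is_error = not isinstance(call.excinfo.value, AssertionError)
        report.user_properties.append(("errored", is_error))
```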

matthiaskoenig commented 5 years ago

Yes, propagating this to the frontend would be great (e.g. the stack trace). It would make bug reporting very easy. I could imagine a "send error report" button which would post the SBML file, stack trace, library versions, and call command as a GitHub issue. At the very least, it must be as easy as possible to tell real errors apart from failed model-quality assertions.
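A rough sketch of such a reporter; the endpoint is GitHub's real issues REST API, but the helper name, parameters, and payload fields are assumptions for illustration:

```python
import platform

import requests  # assumed dependency for this sketch


def send_error_report(repo, token, model_name, traceback_text, command):
    """Open a GitHub issue with the error details (hypothetical helper)."""
    body = "Model: {}\nCommand: `{}`\nPython: {}\n\n{}".format(
        model_name, command, platform.python_version(), traceback_text)
    response = requests.post(
        "https://api.github.com/repos/{}/issues".format(repo),
        headers={"Authorization": "token {}".format(token)},
        json={
            "title": "memote error report: {}".format(model_name),
            "body": body,
        },
    )
    response.raise_for_status()
    return response.json()["html_url"]
```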

Midnighter commented 5 years ago

Oooh, nice ideas. I like it.