langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License
94.63k stars 15.31k forks source link

Issue: Unable to add a unit test for experimental modules #7451

Closed borisdev closed 1 year ago

borisdev commented 1 year ago

Adding a unit test for any experimental module in the standard location, such as tests/unit_tests/experimental/test_baby_agi.py, leads to this failing unit test:

../tests/unit_tests/output_parsers/test_base_output_parser.py ...................................F
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> traceback >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

    def test_all_subclasses_implement_unique_type() -> None:
        types = defaultdict(list)
        for cls in _NON_ABSTRACT_PARSERS:
            try:
                types[cls._type].append(cls.__name__)
            except NotImplementedError:
                # This is handled in the previous test
                pass
        dups = {t: names for t, names in types.items() if len(names) > 1}
>       assert not dups, f"Duplicate types: {dups}"
E       AssertionError: Duplicate types: {<property object at 0xffff9126e7f0>: ['EnumOutputParser', 'AutoGPTOutputParser', 'NoOutputParser', 'StructuredQueryOutputParser', 'PlanningOutputParser'], <property object at 0xffff7f331710>: ['PydanticOutputParser', 'LineListOutputParser']}
E       assert not {<property object at 0xffff9126e7f0>: ['EnumOutputParser', 'AutoGPTOutputParser', 'NoOutputParser', 'StructuredQueryOu...arser', 'PlanningOutputParser'], <property object at 0xffff7f331710>: ['PydanticOutputParser', 'LineListOutputParser']}

../tests/unit_tests/output_parsers/test_base_output_parser.py:55: AssertionError
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PDB post_mortem >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> /workspaces/tests/unit_tests/output_parsers/test_base_output_parser.py(55)test_all_subclasses_implement_unique_type()
-> assert not dups, f"Duplicate types: {dups}"

Repro is here and artifact here.

dosubot[bot] commented 1 year ago

Answer generated by a 🤖

Answer

I understand that you're having trouble adding a unit test for an experimental module in LangChain. The issue seems to be that when you add a new test file, Python imports all the modules and their dependencies. If one of these dependencies is a subclass of BaseOutputParser, it gets added to the _NON_ABSTRACT_PARSERS list multiple times, causing the test test_all_subclasses_implement_unique_type to fail.

Here's how I would approach this issue: I would modify the non_abstract_subclasses function to return a set instead of a list. This change should automatically remove any duplicates and prevent the test from failing when you add a new test file. Here's how you can do it:

def non_abstract_subclasses(
    cls: Type[ABC], to_skip: Optional[Set] = None
) -> Set[Type]:
    """Recursively find all non-abstract subclasses of a class."""
    _to_skip = to_skip or set()
    subclasses = set()
    for subclass in cls.__subclasses__():
        if not getattr(subclass, "__abstractmethods__", None):
            if subclass.__name__ not in _to_skip:
                subclasses.add(subclass)
        subclasses.update(non_abstract_subclasses(subclass, to_skip=_to_skip))
    return subclasses

This solution should address the issue you're facing. However, please note that this is just one potential solution and there might be other ways to address this issue.

Additionally, it's a good idea to add a separate directory for experimental unit tests to keep them separate from the standard ones. This can help prevent any unexpected issues that might arise from adding experimental features to the main codebase.

If you find this solution helpful and believe it's not covered by existing documentation, I encourage you to make a pull request to update the documentation. This will help other users who might encounter the same issue in the future.

Sources


This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

borisdev commented 1 year ago

A partial solution was adding _type method to a subclass

class AutoGPTOutputParser(BaseAutoGPTOutputParser):

def _type(self) -> str:
    return "autogptparser"

But I could not figure out why in langchain/experimental.__init__.py

this line

from langchain.experimental.generative_agents.generative_agent import GenerativeAgent

was breaking the unit tests, especially since that part of the code does not seem to subclass BaseOutputParser?

As a workaround, I will move my unit tests to my integration test module.

dosubot[bot] commented 1 year ago

Hi, @borisdev! I'm Dosu, and I'm helping the LangChain team manage our backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, the issue you reported was about adding a unit test for an experimental module in the standard location, which led to a failing unit test due to duplicate types in the output parsers. I suggested a potential solution by modifying the non_abstract_subclasses function to return a set instead of a list, which should remove any duplicates and prevent the test from failing. You also mentioned a partial solution by adding a _type method to a subclass, but couldn't figure out why a specific line of code was breaking the unit tests. As a workaround, you mentioned that you will move your unit tests to the integration test module.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your contribution to the LangChain repository!

borisdev commented 1 year ago

Since experimental is now a separate library I think the issue is obsolete. When I have time, I will move my unit tests out of integration and re-run units tests. TY! Closing.

dosubot[bot] commented 1 year ago

Thank you, @borisdev, for closing the issue. We appreciate your contribution to the LangChain repository!