Closed ImVexed closed 1 year ago
Thanks for reporting. Can you please elaborate on the issue and your setup? The conversion seems to happen implicitly on my system, and it also works on Google Colab, which is a similar setup (Ubuntu 22.04).
I'm on Windows 11. When logging with logger.info(f"Scanning path for extension: {extension}"),
for example, I see Scanning path for extension: DocumentExtension.pdf,
and I assume it also caused list(docs_path.glob(f"**/*.{extension.value}"))
to become list(docs_path.glob(f"**/*.DocumentExtension.pdf")).
Considering that DocumentExtension is an Enum, it seems to have a key and a value when used as an explicit type. And for me, Python 3.11's Enum.__str__ is:
    def __str__(self):
        return "%s.%s" % (self.__class__.__name__, self._name_, )
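To make the difference concrete, here is a minimal sketch of how an enum member renders through str(), through .value, and inside an f-string. The DocumentExtension class below is a hypothetical stand-in, and it assumes the package's enum mixes in str; the real definition may differ.

```python
from enum import Enum


class DocumentExtension(str, Enum):
    # Hypothetical stand-in for the package's enum (assumed str mixin).
    pdf = "pdf"
    md = "md"


ext = DocumentExtension.pdf

# str() goes through Enum.__str__ and yields "ClassName.member_name".
as_str = str(ext)

# .value is the raw string and does not depend on the Python version.
as_value = ext.value

# f-string formatting is the version-sensitive part: with a str mixin,
# Python 3.10 formats the member as its value, while 3.11 matches str().
as_fstring = f"{ext}"

print(as_str, as_value, as_fstring)
```

Only the f-string result changes between interpreter versions, which would explain why the same code behaves differently on the two setups.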
Thanks for that! I understand now: it looks like the problem is that Enum behaves differently on Python 3.10 (which was used to develop this package) and on 3.11. On 3.10 the enum member is implicitly converted to its string value, which doesn't happen on 3.11.
I will work on compatibility with 3.11. In the meantime, if you have access to a virtualenv with Python 3.10, it should hopefully work there.
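One version-independent way to build the glob pattern is to read .value explicitly rather than rely on string formatting of the member. The sketch below is only an illustration: find_documents and the DocumentExtension stand-in are hypothetical names, not the package's actual code or the fix that was shipped.

```python
from enum import Enum
from pathlib import Path
from typing import List


class DocumentExtension(str, Enum):
    # Hypothetical stand-in for the package's enum.
    pdf = "pdf"
    md = "md"


def find_documents(docs_path: Path, extension: DocumentExtension) -> List[Path]:
    """Recursively list files under docs_path with the given extension."""
    # Using .value keeps the pattern as "**/*.pdf" on both 3.10 and 3.11,
    # instead of "**/*.DocumentExtension.pdf".
    pattern = f"**/*.{extension.value}"
    return sorted(docs_path.glob(pattern))
```

Calling find_documents(Path("docs"), DocumentExtension.pdf) would then match PDF files in docs and its subfolders regardless of how the interpreter formats enum members.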
Fixed in version 0.3.2
https://github.com/snexus/llm-search/blob/main/src/llmsearch/parsers/splitter.py#L53 and many other places in the file seem to access extension as if it were a string, but it's an object, which causes the splitter to find no files. This was causing the examples not to work for me.