BlueBrain / Search

Blue Brain text mining toolbox for semantic search and structured information extraction
https://blue-brain-search.readthedocs.io
GNU Lesser General Public License v3.0

`bluesearch.utils.find_files` can be slow when doing regex #587

Open jankrepl opened 2 years ago

jankrepl commented 2 years ago

🐛 Bug description

We noticed that `find_files` can get really slow when it matches a regex pattern recursively.

It would be nice to investigate whether this could be improved.

find_files(pathlib.Path("../neuroscience-literature/medrxiv/"), True, r".*\.meca$")
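
As a starting point for the investigation, here is a minimal sketch of a single-pass traversal built on `os.scandir`. It is not the `bluesearch` implementation: the function name `find_files_scandir` is hypothetical, the argument order simply mirrors the call above, and matching the regex against the file name only (with `re.search`) is an assumption.

```python
import os
import pathlib
import re


def find_files_scandir(input_path, recursive, match_filename=None):
    """Hypothetical alternative to find_files; not the library's code."""
    # Compile the pattern once instead of once per file.
    pattern = re.compile(match_filename) if match_filename else None
    found = []
    stack = [pathlib.Path(input_path)]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                # DirEntry usually caches the file type from the directory
                # listing, so these checks often avoid an extra stat call.
                if entry.is_dir(follow_symlinks=False):
                    if recursive:
                        stack.append(pathlib.Path(entry.path))
                elif entry.is_file():
                    # Assumption: the regex is applied to the file name only.
                    if pattern is None or pattern.search(entry.name):
                        found.append(pathlib.Path(entry.path))
    return sorted(found)
```

Whether this helps on a mounted remote file system depends on how that file system reports entry types, so it would need to be measured rather than assumed.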

jankrepl commented 2 years ago

Note that this might be because we are working on a mounted remote file system.
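
One way to check this hypothesis is to time the directory traversal separately from the regex matching. The sketch below does that with `os.walk`; the helper name `time_traversal_and_matching` is hypothetical, and it assumes the pattern is applied to file names only, which may differ from what `find_files` does.

```python
import os
import re
import time


def time_traversal_and_matching(root, pattern):
    """Return (n_files, n_matches, traversal_seconds, matching_seconds)."""
    # Step 1: walk the tree and collect file names (file system cost).
    start = time.perf_counter()
    names = [name for _, _, files in os.walk(root) for name in files]
    traversal = time.perf_counter() - start

    # Step 2: apply the regex to the names already in memory (CPU cost).
    regex = re.compile(pattern)
    start = time.perf_counter()
    n_matches = sum(1 for name in names if regex.search(name))
    matching = time.perf_counter() - start

    return len(names), n_matches, traversal, matching
```

If the traversal time dominates, the file system is the more likely bottleneck and changing the regex handling would probably not help much.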