edubotics-ai / edubotics-core

Comprehensive set of core modules for vector storage, retrieval, processing, with more to come.
http://docs.edubotics.ai/
MIT License

Add functionality to ignore certain url paths or files #86

Open XThomasBU opened 3 months ago

XThomasBU commented 3 months ago

Currently, all child URLs are read, processed, and stored in the vectorstore. Build functionality to ignore certain URL paths or files, for example those listed in storage/data/to_ignore.txt.

https://github.com/DL4DS/dl4ds_tutor/blob/e934b90b1d5e9771ed279e085200b30c83c29d60/code/modules/vectorstore/store_manager.py#L49
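A rough sketch of what the filtering step could look like, assuming to_ignore.txt holds one glob-style pattern per line, with `#` for comments. The file format and the helper names (`load_ignore_patterns`, `should_ignore`, `filter_urls`) are hypothetical and not part of the existing store_manager.py:

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def load_ignore_patterns(lines):
    """Collect patterns, skipping blank lines and '#' comments (assumed file format)."""
    patterns = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def should_ignore(url, patterns):
    """Return True if the full URL or just its path matches any ignore pattern."""
    path = urlparse(url).path
    return any(fnmatchcase(url, p) or fnmatchcase(path, p) for p in patterns)


def filter_urls(urls, patterns):
    """Keep only the URLs that match no ignore pattern, preserving order."""
    return [u for u in urls if not should_ignore(u, patterns)]


if __name__ == "__main__":
    # In practice the lines would come from storage/data/to_ignore.txt.
    patterns = load_ignore_patterns(["# skip private pages and PDFs", "/private/*", "*.pdf"])
    urls = [
        "https://example.com/index.html",
        "https://example.com/private/notes.html",
        "https://example.com/files/syllabus.pdf",
    ]
    print(filter_urls(urls, patterns))
```

Matching against both the full URL and the path lets a pattern such as `/private/*` target a path prefix while `*.pdf` targets a file type. The filter would presumably run on the crawled child URLs before they are processed and stored.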

Farid-Karimli commented 2 months ago

@XThomasBU I think this fits the vectorstore UI, where users could interactively pick links.