Open vkehfdl1 opened 1 month ago
As an alternative, we can build an example Jupyter notebook.
To support AWS well, I think it is better to use fsspec. It provides a unified interface for loading files! We currently support only PDF, so this would mean loading PDF files from all kinds of file systems.
Below is the full list of protocols that fsspec supports.
It includes Dropbox, Google Drive, S3, and even Jupyter and GitHub!
['abfs',
'adl',
'arrow_hdfs',
'asynclocal',
'az',
'blockcache',
'box',
'cached',
'dask',
'data',
'dbfs',
'dir',
'dropbox',
'dvc',
'file',
'filecache',
'ftp',
'gcs',
'gdrive',
'generic',
'git',
'github',
'gs',
'hdfs',
'hf',
'http',
'https',
'jlab',
'jupyter',
'lakefs',
'libarchive',
'local',
'memory',
'oci',
'ocilake',
'oss',
'reference',
'root',
's3',
's3a',
'sftp',
'simplecache',
'smb',
'ssh',
'tar',
'wandb',
'webdav',
'webhdfs',
'zip']
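To make the idea concrete, here is a rough sketch of what loading a PDF through fsspec could look like. The helper names `read_pdf_bytes` and `is_supported` are hypothetical and not part of this project; the demo uses fsspec's built-in `memory` filesystem so that it runs without any credentials. Only `fsspec.open` and `fsspec.available_protocols` are real fsspec calls here.

```python
import fsspec


def read_pdf_bytes(url: str, **storage_options) -> bytes:
    """Hypothetical helper: return the raw bytes of the file at `url`.

    The protocol prefix (s3://, gs://, file://, memory://, ...) selects
    the backend; `storage_options` are forwarded to that backend.
    """
    with fsspec.open(url, "rb", **storage_options) as f:
        return f.read()


def is_supported(protocol: str) -> bool:
    """Hypothetical helper: check whether fsspec knows this protocol name."""
    return protocol in fsspec.available_protocols()


# Demo against the in-memory filesystem, so no cloud account is needed.
with fsspec.open("memory://sample.pdf", "wb") as f:
    f.write(b"%PDF-1.4 demo content")

data = read_pdf_bytes("memory://sample.pdf")
print(len(data), is_supported("s3"))
```

For S3 the call would presumably look like `read_pdf_bytes("s3://some-bucket/report.pdf")`, where the bucket and key are placeholders. That path needs the optional `s3fs` package installed and AWS credentials configured. The returned bytes could then be handed to whatever PDF parser we already use.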
Is your feature request related to a problem? Please describe. I want to connect Amazon S3 as a file source so that its documents can be parsed into the VectorDB.
Describe the solution you'd like Use LangChain or LlamaIndex (or something better) to connect many document sources to parsing.
Describe alternatives you've considered We can use another library, like liteLLM, for getting documents.