Bug Description
This may just be a Windows issue.
When importing files using AzStorageBlobReader, the variable download_file_path is set, and the blob metadata is then added with this file path as the key.
When SimpleDirectoryReader then tries to get the metadata using extract_blob_meta, a KeyError is thrown on the first line of the function:
meta: dict = blob_meta[file_path]
The reason is that when SimpleDirectoryReader iterates through the contents of the directory passed in, it does not generate the paths itself; it pulls them from the directory contents.
In my local test, download_file_path is:
C:\\Users\\Me\\AppData\\Local\\Temp\\tmpfrm_02oi/myfile.pdf - notice the / prior to the file name
This is how the key for blob_meta is set.
Then, when SimpleDirectoryReader executes extract_blob_meta, it passes in a path of:
C:\\Users\\Me\\AppData\\Local\\Temp\\tmpfrm_02oi\\myfile.pdf - notice the \\ prior to the file name
Suggest switching lines 92 & 110 to:
llama-hub v0.0.70
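For illustration, the mismatch described above can be reproduced without Azure or llama-hub. The snippet below is only a minimal sketch, not the library's code and not the change suggested for lines 92 & 110: the `blob_meta` dict is a stand-in with made-up metadata, `lookup_blob_meta` is a hypothetical helper, and `ntpath` is used instead of `os.path` so the Windows path rules apply on any OS.

```python
import ntpath  # Windows flavour of os.path, importable on every platform

# Key as reported for download_file_path: "/" before the file name.
download_file_path = r"C:\Users\Me\AppData\Local\Temp\tmpfrm_02oi/myfile.pdf"

# Stand-in for the metadata mapping (the metadata values are made up).
blob_meta = {download_file_path: {"name": "myfile.pdf"}}

# Path as it comes back from the directory listing: "\" before the file name.
lookup_path = r"C:\Users\Me\AppData\Local\Temp\tmpfrm_02oi\myfile.pdf"

# The two strings differ by one separator, so a plain dict lookup
# (blob_meta[lookup_path]) would raise KeyError.
raw_lookup_failed = lookup_path not in blob_meta


def lookup_blob_meta(blob_meta: dict, file_path: str) -> dict:
    """Hypothetical helper: normalise separators on both sides before the lookup."""
    normalised = {ntpath.normpath(key): value for key, value in blob_meta.items()}
    return normalised[ntpath.normpath(file_path)]


meta = lookup_blob_meta(blob_meta, lookup_path)
print(raw_lookup_failed, meta)
```

On Windows, `os.path.normpath` behaves the same as `ntpath.normpath`, so normalising the key when it is stored (or building it with `os.path.join`) would be another way to make both sides agree.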
Version
0.9.30
Steps to Reproduce
documents = loader.load_data()