Open collindutter opened 1 month ago
Note that this PR's base branch is `refactor/artifacts`, so no tests have run yet.
Attention: Patch coverage is 95.58824% with 9 lines in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
---|---|---
griptape/loaders/text_loader.py | 71.42% | 1 Missing and 1 partial :warning:
griptape/utils/file_utils.py | 80.00% | 1 Missing and 1 partial :warning:
.../drivers/file_manager/local_file_manager_driver.py | 75.00% | 0 Missing and 1 partial :warning:
griptape/loaders/blob_loader.py | 80.00% | 1 Missing :warning:
griptape/tasks/base_image_generation_task.py | 0.00% | 1 Missing :warning:
griptape/tools/file_manager/tool.py | 87.50% | 0 Missing and 1 partial :warning:
griptape/tools/image_query/tool.py | 0.00% | 1 Missing :warning:
Describe your changes
Added
- `BaseFileLoader` for Loaders that load from a path.
- `BaseLoader.fetch()` method for fetching data from a source.
- `BaseLoader.parse()` method for parsing fetched data.
- `BaseFileManager.encoding` to specify the encoding when loading and saving files.
- `BaseWebScraperDriver.extract_page()` method for extracting data from an already scraped web page.
- `TextLoaderRetrievalRagModule.chunker` for specifying the chunking strategy.
- `file_utils.get_mime_type` utility for getting the MIME type of a file.

Changed
- Removed `BaseFileManager.default_loader` and `BaseFileManager.loaders`.
- Removed `fileutils.load_file` and `fileutils.load_files`.
- Removed `loaders-dataframe` and `loaders-audio` extras as they are no longer needed.
- `TextLoader`, `PdfLoader`, `ImageLoader`, and `AudioLoader` now take a `str | PathLike` instead of `bytes`.
- Removed `DataframeLoader`.
- `LocalFileManagerDriver.workdir` is now optional.
- `filetype` is now a core dependency.
- `FileManagerTool` now uses `filetype` for more accurate file type detection.
- `BaseFileLoader.load_file()` will now either return a `TextArtifact` or a `BlobArtifact` depending on whether `BaseFileManager.encoding` is set.

The purpose of this PR is to clean up the Loader interface and define the purpose of Loaders.
Loaders fetch data from a source and parse it into Artifacts. Loaders do not chunk data; that is the role of Chunkers.
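The fetch/parse split described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in this PR: `SketchLoader`, `SketchFileLoader`, `SketchTextArtifact`, `SketchBlobArtifact`, `chunk_text`, and `demo` are all hypothetical names, and the way `encoding` selects between a text and a blob result only mirrors the behaviour described in the changelog entry for `load_file()`.

```python
from __future__ import annotations

import tempfile
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Union


# Hypothetical stand-ins for Artifacts; these are not griptape's classes.
@dataclass
class SketchTextArtifact:
    value: str


@dataclass
class SketchBlobArtifact:
    value: bytes


Artifact = Union[SketchTextArtifact, SketchBlobArtifact]


class SketchLoader(ABC):
    """Splits loading into two steps: fetch raw data, then parse it."""

    @abstractmethod
    def fetch(self, source: str) -> bytes:
        """Read raw bytes from the source."""

    @abstractmethod
    def parse(self, data: bytes) -> Artifact:
        """Turn already-fetched bytes into an artifact."""

    def load(self, source: str) -> Artifact:
        # Loading is just fetch followed by parse; no chunking happens here.
        return self.parse(self.fetch(source))


class SketchFileLoader(SketchLoader):
    """Assumed file-based loader: the source is a filesystem path."""

    def __init__(self, encoding: Optional[str] = "utf-8") -> None:
        self.encoding = encoding

    def fetch(self, source: str) -> bytes:
        return Path(source).read_bytes()

    def parse(self, data: bytes) -> Artifact:
        # With no encoding set, keep the raw bytes; otherwise decode to text.
        if self.encoding is None:
            return SketchBlobArtifact(data)
        return SketchTextArtifact(data.decode(self.encoding))


def chunk_text(artifact: SketchTextArtifact, max_chars: int) -> list[SketchTextArtifact]:
    """Chunking is a separate step that runs on the loader's output."""
    text = artifact.value
    return [SketchTextArtifact(text[i:i + max_chars]) for i in range(0, len(text), max_chars)]


def demo() -> tuple:
    """Load one temporary file as text and as a blob, then chunk the text."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "note.txt"
        path.write_text("hello loaders", encoding="utf-8")
        text = SketchFileLoader().load(str(path))
        blob = SketchFileLoader(encoding=None).load(str(path))
    return text, blob, chunk_text(text, max_chars=5)
```

Keeping `parse()` independent of `fetch()` means data obtained some other way (for example, bytes already in memory) can be parsed without touching the source again, and chunking can be swapped without changing any loader.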
We provide 4 top-level Loaders that load from a variety of sources:
BaseFileLoader then has subclasses that provide file-type-specific parsing logic:
Issue ticket number and link
Closes https://github.com/griptape-ai/griptape/issues/1102
Documentation preview: https://griptape--1116.org.readthedocs.build//1116/