Unstructured-IO / unstructured-api-tools

Apache License 2.0

Added notebooks for tests #142

Closed kravetsmic closed 1 year ago

kravetsmic commented 1 year ago

Added the notebooks listed in issue #104.

cragwolfe commented 1 year ago

The notebooks look good except for one issue with the pipeline-process-text-file-... notebooks: exactly one of the text or file args will be None (and the other not None) when invoked from the FastAPI route. So, as written, they'll fail with a stack trace if the file is None.

Also, you'll need to run make generate-test-api to generate the APIs under test_unstructured_api_tools/pipeline-test-project/prepline_test_project/api/.

Good start; of course, there are lots of unit tests to go :)

cragwolfe commented 1 year ago

From: https://github.com/kravetsmic/unstructured-api-tools/blob/kravetsmic/add-additional-notebooks/test_unstructured_api_tools/pipeline-test-project/pipeline-notebooks/pipeline-process-text-file-1.ipynb

# pipeline-api
def pipeline_api(
    text,
    file=None,
    filename=None,
    file_content_type=None,
):
    return {"silly_result": ' : '.join([
        str(len(text)),
        text,
        str(len(file.read())),
        filename,
        str(file_content_type),
    ])}

This will fail with a stack trace if either text or file is None: len(text) raises a TypeError when text is None, and file.read() raises an AttributeError when file is None. One of them always will be None when called from the FastAPI route, so the function should return a valid response in either case. (If text is not None, all the other values are None.)
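For illustration, here is a minimal sketch of one way the function could tolerate either argument being None. It is not necessarily the fix the notebooks should adopt: reporting a missing input as length 0 and rendering None values with str() are assumptions made for this example.

```python
import io


# pipeline-api
def pipeline_api(
    text,
    file=None,
    filename=None,
    file_content_type=None,
):
    # Assumption: a missing input is reported as length 0 rather than raising.
    text_len = len(text) if text is not None else 0
    file_len = len(file.read()) if file is not None else 0
    # str() keeps join() from failing on None values (text, filename, ...).
    return {"silly_result": " : ".join([
        str(text_len),
        str(text),
        str(file_len),
        str(filename),
        str(file_content_type),
    ])}


# Text-only call: file, filename and file_content_type are all None.
text_only = pipeline_api("hello")

# File-only call: text is None; io.BytesIO stands in for the uploaded file.
file_only = pipeline_api(
    None,
    file=io.BytesIO(b"abc"),
    filename="a.txt",
    file_content_type="text/plain",
)
```

Both calls return a dictionary instead of raising, which is the behavior the FastAPI route needs in either case.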