datasette / datasette-extract

Import unstructured data (text and images) into structured tables
Apache License 2.0
129 stars, 3 forks

Image support #5

Closed: simonw closed this 3 months ago

simonw commented 4 months ago

Being able to drag an image - or even multiple images - would be incredible.

simonw commented 4 months ago

For images I'll need https://github.com/Kludex/python-multipart - Datasette doesn't handle file uploads yet.

simonw commented 4 months ago

Probably better to use Starlette: https://github.com/encode/starlette/blob/master/starlette/formparsers.py

As seen in datasette-upload-csvs: https://github.com/simonw/datasette-upload-csvs/blob/main/datasette_upload_csvs/__init__.py

simonw commented 4 months ago

Rats. I forgot that tools are not yet supported by the gpt-4-vision-preview model, so this can't work directly.

The best I could do is use GPT-4 Vision to perform a form of OCR, then send the OCR'd text through a model that does support tools in a second pass. That's not a bad idea, but it's still frustrating to need to do it.
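A rough sketch of what that two-pass flow could look like, shown only as the two request bodies for OpenAI's chat completions endpoint. The function names, the prompt wording, the `max_tokens` value, the `extract_data` tool name and the `toolsModel` parameter are all placeholders of mine, not anything from the plugin, which would do this server-side in Python:

```javascript
// Pass 1: ask the vision model to transcribe the image. No tools are sent,
// because gpt-4-vision-preview does not accept them.
function buildTranscriptionRequest(imageDataUrl) {
  return {
    model: "gpt-4-vision-preview",
    max_tokens: 1000, // assumed limit, tune as needed
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "Transcribe all of the text in this image." },
          { type: "image_url", image_url: { url: imageDataUrl } },
        ],
      },
    ],
  };
}

// Pass 2: send the transcript to a tools-capable model (name supplied by the
// caller) and force a call to a hypothetical extract_data function.
function buildExtractionRequest(transcript, toolsModel, schema) {
  return {
    model: toolsModel,
    messages: [{ role: "user", content: transcript }],
    tools: [
      {
        type: "function",
        function: {
          name: "extract_data",
          description: "Record the structured data found in the text",
          parameters: schema, // JSON Schema describing the target columns
        },
      },
    ],
    tool_choice: { type: "function", function: { name: "extract_data" } },
  };
}
```

The text returned by the first request becomes the `transcript` argument of the second, so the image itself never has to reach the model that handles tools.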

simonw commented 4 months ago

Got this error:

You uploaded an unsupported image. Please make sure your image is below 20 MB in size and is of one the following formats: ['png', 'jpeg', 'gif', 'webp'].

That was for an image dragged from iMessage. It turned out to be a HEIC.
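One way to catch this before the API does would be to sniff the first few bytes of the file in the browser. This is a minimal sketch, not what the plugin does: `sniffImageFormat` and its helpers are names I made up, and the HEIC check only looks at a handful of common `ftyp` brands, so other HEIF variants would come back as "unknown".

```javascript
// Brands that commonly mark HEIC files; other HEIF brands exist and are not covered here.
const HEIC_BRANDS = new Set(["heic", "heix", "hevc", "hevx"]);

// Read bytes[start:end] as an ASCII string.
function ascii(bytes, start, end) {
  return String.fromCharCode(...bytes.slice(start, end));
}

// True if the file begins with the given byte signature.
function startsWith(bytes, signature) {
  return signature.every((value, i) => bytes[i] === value);
}

// Returns "png", "jpeg", "gif", "webp", "heic" or "unknown" for a Uint8Array
// holding at least the first 12 bytes of the file.
function sniffImageFormat(bytes) {
  if (startsWith(bytes, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "png";
  if (startsWith(bytes, [0xff, 0xd8, 0xff])) return "jpeg";
  const head = ascii(bytes, 0, 6);
  if (head === "GIF87a" || head === "GIF89a") return "gif";
  if (ascii(bytes, 0, 4) === "RIFF" && ascii(bytes, 8, 12) === "WEBP") return "webp";
  if (ascii(bytes, 4, 8) === "ftyp" && HEIC_BRANDS.has(ascii(bytes, 8, 12))) return "heic";
  return "unknown";
}
```

In a page this could be fed with `new Uint8Array(await file.slice(0, 12).arrayBuffer())`, which avoids trusting the file extension or the MIME type the browser reports.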

simonw commented 4 months ago

I can do the HEIC to JPEG conversion in the browser like this:


    <input type="file" id="heicInput" accept=".heic">
    <img id="jpegOutput" alt="Converted JPEG" style="max-width: 100%">
    <script type="module">
        import heic2any from 'https://cdn.jsdelivr.net/npm/heic2any@0.0.4/+esm';

        const input = document.getElementById("heicInput");
        const output = document.getElementById("jpegOutput");

        input.addEventListener("change", async (event) => {
            const file = event.target.files[0];
            if (file) {
                try {
                    // Decode the HEIC file and re-encode it as a JPEG blob
                    const blob = await heic2any({
                        blob: file,
                        toType: "image/jpeg",
                        quality: 0.8
                    });
                    // Show a preview of the converted image
                    output.src = URL.createObjectURL(blob);
                    // Wrap the blob in a File and assign it to the form's image
                    // input (#id_image, defined elsewhere on the page) through a
                    // DataTransfer, since a FileList can't be constructed directly
                    const jpegFile = new File([blob], "converted.jpeg", { type: "image/jpeg" });
                    const dataTransfer = new DataTransfer();
                    dataTransfer.items.add(jpegFile);
                    document.querySelector('#id_image').files = dataTransfer.files;
                } catch (error) {
                    console.error("Conversion error:", error);
                }
            }
        });
    </script>

simonw commented 4 months ago

Ideally I'd let users drop images, PDFs or text files on the big pink drop target and have it figure out what to do with them - stick images in the image upload field, convert HEICs to JPEGs, etc.
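A minimal sketch of how that routing decision could be made for each dropped file. `classifyDroppedFile` and the action names it returns are hypothetical, and the extension fallback is there on the assumption that some browsers report an empty `type` for HEIC files:

```javascript
// Lowercased extension of a filename, or "" if it has none.
function extensionOf(name) {
  const dot = name.lastIndexOf(".");
  return dot === -1 ? "" : name.slice(dot + 1).toLowerCase();
}

// Decide what to do with a dropped file, given its name and MIME type.
// Returns "convert-heic", "image", "pdf", "text" or "unsupported".
function classifyDroppedFile(file) {
  const type = (file.type || "").toLowerCase();
  const ext = extensionOf(file.name || "");
  if (type === "image/heic" || type === "image/heif" || ext === "heic" || ext === "heif") {
    return "convert-heic"; // needs converting to JPEG before upload
  }
  if (["image/png", "image/jpeg", "image/gif", "image/webp"].includes(type)) {
    return "image"; // can go straight into the image upload field
  }
  if (type === "application/pdf" || ext === "pdf") return "pdf";
  if (type.startsWith("text/") || ext === "txt") return "text";
  return "unsupported";
}
```

A `drop` handler could loop over `event.dataTransfer.files`, call this on each one and dispatch on the result. How PDFs and text files should then be handled is left open here.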

It looks to me like https://cdn.jsdelivr.net/npm/heic2any@0.0.4/+esm is a single bundle that's 1.3MB and includes the HEIC decoder WASM thing as a base64 blob, which is pretty neat.

simonw commented 4 months ago

https://www.npmjs.com/package/heic2any is MIT licensed.