n4ze3m / dialoqbase

Create chatbots with ease
https://dialoqbase.n4ze3m.com/
MIT License
1.54k stars 253 forks

Integration of additional document loaders (S3, Google Drive) #81

Closed by mjtechguy 6 months ago

mjtechguy commented 9 months ago

Hello,

First off, this is an amazing project. Really useful.

I was curious whether support is planned for Google Drive and S3-compatible buckets (I guess that would be the S3 directory or S3 file loader in LangChain).

Thanks!

n4ze3m commented 9 months ago

First of all, thank you for sponsoring me ❤️.

Yes, LangChain supports S3 and Google Drive via unstructured.io. I will try to add them as document loaders in an upcoming release.

mjtechguy commented 9 months ago

Awesome. Thanks sir. Happy to be able to support and hope to be able to increase in the future as some financial stuff clears for me.

n4ze3m commented 9 months ago

I will comment here when I have added it. :) You don't need to sponsor me monthly <3

n4ze3m commented 9 months ago

;) Hey, just a quick update about v0.0.30. You can now load PDF URLs from the website loader, and speech-to-text and TTS have been added to the playground UI (you can even use your custom ElevenLabs voice ID too).

n4ze3m commented 9 months ago

Hey, YouTube as a data source has been released, along with a new PG hybrid retrieval, which can be toggled on the bot settings page.

mjtechguy commented 9 months ago

Awesome, @n4ze3m! Thanks so much. Testing tonight.

n4ze3m commented 9 months ago

Hey, YouTube, MP3, and MP4 sources may take a few minutes (1-5 minutes) to process, depending on their length. We're using Transformers.js Whisper for processing instead of the OpenAI API.

mjtechguy commented 9 months ago

@n4ze3m, I tested with a couple of different videos and I get this error on all of them, regardless of YouTube video length.

I tested this on a Linux VM and on WSL2 Ubuntu on Windows.

The common error is:

dialoqbase           | [ffout] FFMPEG_END
dialoqbase           | [info] run FS.readFile ./audio.wav
dialoqbase           | [Program terminated with exit(0)] undefined

Full output of one of the runs:

dialoqbase           | [info] load ffmpeg-core
dialoqbase           | [info] loading ffmpeg-core
dialoqbase           | [info] ffmpeg-core loaded
dialoqbase           | ffmpeg loaded
dialoqbase           | [info] run FS.writeFile ./audio.mp3 <1388848909 bytes binary file>
dialoqbase           | [info] run ffmpeg command: -i ./audio.mp3 -acodec pcm_s16le -ac 1 -ar 16000 ./audio.wav
dialoqbase           | [fferr] ffmpeg version v0.11.0-6-g16758e9d2b Copyright (c) 2000-2020 the FFmpeg developers
dialoqbase           | [fferr]   built with emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.34 (57b21b8fdcbe3ebb523178b79465254668eab408)
dialoqbase           | [fferr]   configuration: --target-os=none --arch=x86_32 --enable-cross-compile --disable-x86asm --disable-inline-asm --disable-stripping --disable-programs --disable-doc --disable-debug --disable-runtime-cpudetect --disable-autodetect --extra-cflags='-O3 --closure=1 -I/work/ffmpeg.wasm-core/build/include -s USE_PTHREADS=1' --extra-cxxflags='-O3 --closure=1 -I/work/ffmpeg.wasm-core/build/include -s USE_PTHREADS=1' --extra-ldflags='-O3 --closure=1 -I/work/ffmpeg.wasm-core/build/include -s USE_PTHREADS=1 -L/work/ffmpeg.wasm-core/build/lib' --pkg-config-flags=--static --nm=llvm-nm --ar=emar --ranlib=emranlib --cc=emcc --cxx=em++ --objcc=emcc --dep-cc=emcc --enable-gpl --enable-nonfree --enable-zlib --enable-libx264 --enable-libx265 --enable-libvpx --enable-libwavpack --enable-libmp3lame --enable-libfdk-aac --enable-libtheora --enable-libvorbis --enable-libfreetype --enable-libopus --enable-libwebp --enable-libass --enable-libfribidi
dialoqbase           | [fferr]   libavutil      56. 51.100 / 56. 51.100
dialoqbase           | [fferr]   libavcodec     58. 91.100 / 58. 91.100
dialoqbase           | [fferr]   libavformat    58. 45.100 / 58. 45.100
dialoqbase           | [fferr]   libavdevice    58. 10.100 / 58. 10.100
dialoqbase           | [fferr]   libavfilter     7. 85.100 /  7. 85.100
dialoqbase           | [fferr]   libswscale      5.  7.100 /  5.  7.100
dialoqbase           | [fferr]   libswresample   3.  7.100 /  3.  7.100
dialoqbase           | [fferr]   libpostproc    55.  7.100 / 55.  7.100
dialoqbase           | [fferr] Input #0, mov,mp4,m4a,3gp,3g2,mj2, from './audio.mp3':
dialoqbase           | [fferr]   Metadata:
dialoqbase           | [fferr]     major_brand     : mp42
dialoqbase           | [fferr]     minor_version   : 0
dialoqbase           | [fferr]     compatible_brands: isommp42
dialoqbase           | [fferr]     creation_time   : 2023-09-29T09:51:00.000000Z
dialoqbase           | [fferr]   Duration: 02:41:12.95, start: 0.000000, bitrate: 1148 kb/s
dialoqbase           | [fferr]     Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, bt709), 1280x720 [SAR 1:1 DAR 16:9], 1016 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
dialoqbase           | [fferr]     Metadata:
dialoqbase           | [fferr]       creation_time   : 2023-09-29T09:51:00.000000Z
dialoqbase           | [fferr]       handler_name    : ISO Media file produced by Google Inc. Created on: 09/29/2023.
dialoqbase           | [fferr]     Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 127 kb/s (default)
dialoqbase           | [fferr]     Metadata:
dialoqbase           | [fferr]       creation_time   : 2023-09-29T09:51:00.000000Z
dialoqbase           | [fferr]       handler_name    : ISO Media file produced by Google Inc. Created on: 09/29/2023.
dialoqbase           | [fferr] Stream mapping:
dialoqbase           | [fferr]   Stream #0:1 -> #0:0 (aac (native) -> pcm_s16le (native))
dialoqbase           | [fferr] Output #0, wav, to './audio.wav':
dialoqbase           | [fferr]   Metadata:
dialoqbase           | [fferr]     major_brand     : mp42
dialoqbase           | [fferr]     minor_version   : 0
dialoqbase           | [fferr]     compatible_brands: isommp42
dialoqbase           | [fferr]     ISFT            : Lavf58.45.100
dialoqbase           | [fferr]     Stream #0:0(eng): Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s (default)
dialoqbase           | [fferr]     Metadata:
dialoqbase           | [fferr]       creation_time   : 2023-09-29T09:51:00.000000Z
dialoqbase           | [fferr]       handler_name    : ISO Media file produced by Google Inc. Created on: 09/29/2023.
dialoqbase           | [fferr]       encoder         : Lavc58.91.100 pcm_s16le
dialoqbase           | [fferr] size=       1kB time=00:00:00.04 bitrate= 269.7kbits/s speed=0.0648x
dialoqbase           | [fferr] size=    3840kB time=00:02:06.01 bitrate= 249.6kbits/s speed= 105x
dialoqbase           | [fferr] size=    8192kB time=00:04:22.94 bitrate= 255.2kbits/s speed= 155x
dialoqbase           | [fferr] size=   12544kB time=00:06:42.56 bitrate= 255.3kbits/s speed= 183x
dialoqbase           | [fferr] size=   16896kB time=00:09:01.32 bitrate= 255.7kbits/s speed= 200x
dialoqbase           | [fferr] size=   21504kB time=00:11:28.98 bitrate= 255.7kbits/s speed= 215x
dialoqbase           | [fferr] size=   25856kB time=00:13:55.22 bitrate= 253.6kbits/s speed= 226x
dialoqbase           | [fferr] size=   30464kB time=00:16:16.39 bitrate= 255.6kbits/s speed= 232x
dialoqbase           | [fferr] size=   34816kB time=00:18:40.50 bitrate= 254.5kbits/s speed= 238x
dialoqbase           | [fferr] size=   39680kB time=00:21:15.05 bitrate= 254.9kbits/s speed= 245x
dialoqbase           | [fferr] size=   44544kB time=00:23:45.82 bitrate= 255.9kbits/s speed= 250x
dialoqbase           | [fferr] size=   49152kB time=00:26:20.39 bitrate= 254.8kbits/s speed= 255x
dialoqbase           | [fferr] size=   54016kB time=00:28:55.55 bitrate= 255.0kbits/s speed= 259x
dialoqbase           | [fferr] size=   58368kB time=00:31:14.19 bitrate= 255.1kbits/s speed= 260x
dialoqbase           | [fferr] size=   62976kB time=00:33:38.95 bitrate= 255.5kbits/s speed= 262x
dialoqbase           | [fferr] size=   67584kB time=00:36:06.18 bitrate= 255.6kbits/s speed= 264x
dialoqbase           | [fferr] size=   71936kB time=00:38:27.73 bitrate= 255.4kbits/s speed= 265x
dialoqbase           | [fferr] size=   76800kB time=00:41:02.68 bitrate= 255.5kbits/s speed= 268x
dialoqbase           | [fferr] size=   80896kB time=00:43:16.73 bitrate= 255.2kbits/s speed= 268x
dialoqbase           | [fferr] size=   86016kB time=00:45:53.83 bitrate= 255.9kbits/s speed= 270x
dialoqbase           | [fferr] size=   90624kB time=00:48:22.14 bitrate= 255.8kbits/s speed= 271x
dialoqbase           | [fferr] size=   95488kB time=00:50:56.18 bitrate= 256.0kbits/s speed= 273x
dialoqbase           | [fferr] size=  100096kB time=00:53:23.09 bitrate= 256.0kbits/s speed= 274x
dialoqbase           | [fferr] size=  104448kB time=00:55:49.94 bitrate= 255.4kbits/s speed= 274x
dialoqbase           | [fferr] size=  109568kB time=00:58:30.53 bitrate= 255.7kbits/s speed= 276x
dialoqbase           | [fferr] size=  114176kB time=01:00:57.81 bitrate= 255.7kbits/s speed= 277x
dialoqbase           | [fferr] size=  119040kB time=01:03:29.76 bitrate= 256.0kbits/s speed= 278x
dialoqbase           | [fferr] size=  123904kB time=01:06:05.10 bitrate= 256.0kbits/s speed= 279x
dialoqbase           | [fferr] size=  128256kB time=01:08:27.46 bitrate= 255.8kbits/s speed= 279x
dialoqbase           | [fferr] size=  133120kB time=01:11:01.76 bitrate= 255.9kbits/s speed= 280x
dialoqbase           | [fferr] size=  137984kB time=01:13:38.64 bitrate= 255.8kbits/s speed= 281x
dialoqbase           | [fferr] size=  142336kB time=01:15:54.77 bitrate= 256.0kbits/s speed= 281x
dialoqbase           | [fferr] size=  146944kB time=01:18:28.28 bitrate= 255.7kbits/s speed= 282x
dialoqbase           | [fferr] size=  151552kB time=01:20:55.43 bitrate= 255.7kbits/s speed= 282x
dialoqbase           | [fferr] size=  156672kB time=01:23:33.79 bitrate= 256.0kbits/s speed= 283x
dialoqbase           | [fferr] size=  160768kB time=01:25:52.18 bitrate= 255.6kbits/s speed= 283x
dialoqbase           | [fferr] size=  165632kB time=01:28:26.82 bitrate= 255.7kbits/s speed= 283x
dialoqbase           | [fferr] size=  170496kB time=01:30:58.68 bitrate= 255.9kbits/s speed= 284x
dialoqbase           | [fferr] size=  175360kB time=01:33:32.23 bitrate= 256.0kbits/s speed= 285x
dialoqbase           | [fferr] size=  179968kB time=01:35:59.01 bitrate= 256.0kbits/s speed= 284x
dialoqbase           | [fferr] size=  184576kB time=01:38:31.07 bitrate= 255.8kbits/s speed= 285x
dialoqbase           | [fferr] size=  189440kB time=01:41:06.93 bitrate= 255.8kbits/s speed= 285x
dialoqbase           | [fferr] size=  194304kB time=01:43:39.04 bitrate= 255.9kbits/s speed= 286x
dialoqbase           | [fferr] size=  199168kB time=01:46:13.99 bitrate= 256.0kbits/s speed= 286x
dialoqbase           | [fferr] size=  203008kB time=01:48:24.30 bitrate= 255.7kbits/s speed= 286x
dialoqbase           | [fferr] size=  207872kB time=01:50:59.96 bitrate= 255.7kbits/s speed= 286x
dialoqbase           | [fferr] size=  212992kB time=01:53:39.23 bitrate= 255.9kbits/s speed= 287x
dialoqbase           | [fferr] size=  217856kB time=01:56:14.01 bitrate= 255.9kbits/s speed= 288x
dialoqbase           | [fferr] size=  222720kB time=01:58:49.59 bitrate= 255.9kbits/s speed= 288x
dialoqbase           | [fferr] size=  227328kB time=02:01:17.10 bitrate= 255.9kbits/s speed= 288x
dialoqbase           | [fferr] size=  230656kB time=02:03:02.94 bitrate= 255.9kbits/s speed= 287x
dialoqbase           | [fferr] size=  234752kB time=02:05:18.85 bitrate= 255.8kbits/s speed= 286x
dialoqbase           | [fferr] size=  239616kB time=02:07:48.41 bitrate= 256.0kbits/s speed= 287x
dialoqbase           | [fferr] size=  244224kB time=02:10:20.31 bitrate= 255.8kbits/s speed= 287x
dialoqbase           | [fferr] size=  249088kB time=02:12:53.40 bitrate= 255.9kbits/s speed= 287x
dialoqbase           | [fferr] size=  253952kB time=02:15:28.05 bitrate= 256.0kbits/s speed= 288x
dialoqbase           | [fferr] size=  257792kB time=02:17:31.34 bitrate= 255.9kbits/s speed= 287x
dialoqbase           | [fferr] size=  262656kB time=02:20:06.06 bitrate= 256.0kbits/s speed= 287x
dialoqbase           | [fferr] size=  267264kB time=02:22:33.44 bitrate= 256.0kbits/s speed= 287x
dialoqbase           | [fferr] size=  271872kB time=02:25:02.83 bitrate= 255.9kbits/s speed= 288x
dialoqbase           | [fferr] size=  276736kB time=02:27:37.41 bitrate= 255.9kbits/s speed= 288x
dialoqbase           | [fferr] size=  281344kB time=02:30:05.88 bitrate= 255.9kbits/s speed= 288x
dialoqbase           | [fferr] size=  285952kB time=02:32:37.15 bitrate= 255.8kbits/s speed= 288x
dialoqbase           | [fferr] size=  289536kB time=02:34:31.44 bitrate= 255.8kbits/s speed= 287x
dialoqbase           | [fferr] size=  294400kB time=02:37:01.56 bitrate= 256.0kbits/s speed= 288x
dialoqbase           | [fferr] size=  299008kB time=02:39:30.98 bitrate= 255.9kbits/s speed= 288x
dialoqbase           | [fferr] size=  302280kB time=02:41:12.94 bitrate= 256.0kbits/s speed= 288x
dialoqbase           | [fferr] video:0kB audio:302280kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000025%
dialoqbase           | [ffout] FFMPEG_END
dialoqbase           | [info] run FS.readFile ./audio.wav
dialoqbase           | [Program terminated with exit(0)] undefined

n4ze3m commented 9 months ago

Yep, I download YouTube videos as MP3 files and then convert them to WAV files. I do the same for MP3 and MP4 files, using FFmpeg WebAssembly, so that transcription can run on Transformers.js Whisper instead of the OpenAI API, which is fast but really expensive. Also, according to the OpenAI Whisper API docs, we can only upload files up to 25 MB. I will address this issue in upcoming updates. Thanks for the feedback. I tested it with 3- to 5-minute videos, which take around 7 minutes to process on my local PC.

n4ze3m commented 9 months ago

dialoqbase | [ffout] FFMPEG_END
dialoqbase | [info] run FS.readFile ./audio.wav
dialoqbase | [Program terminated with exit(0)] undefined

This is not an error. I think it just means FFmpeg finished processing.
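The numbers in the full log seem to support this: the final output size is what uncompressed 16 kHz, mono, 16-bit PCM would take for the whole reported duration. A minimal sketch of that arithmetic follows; the helper names are made up for illustration and are not part of dialoqbase, and the small WAV header is ignored.

```typescript
// Hypothetical helpers for illustration only; not taken from the dialoqbase code base.

// Parse an FFmpeg-style "HH:MM:SS.ss" duration into seconds.
function parseDuration(hms: string): number {
  const [h, m, s] = hms.split(":").map(Number);
  return h * 3600 + m * 60 + s;
}

// Size in bytes of raw PCM audio (WAV header ignored).
function pcmBytes(
  seconds: number,
  sampleRate: number,
  channels: number,
  bitsPerSample: number
): number {
  return Math.round(seconds * sampleRate * channels * (bitsPerSample / 8));
}

// Values read from the log: "-acodec pcm_s16le -ac 1 -ar 16000" and "Duration: 02:41:12.95".
const seconds = parseDuration("02:41:12.95");
const bytes = pcmBytes(seconds, 16000, 1, 16);

// Size in kB (1 kB = 1024 bytes), to compare with "audio:302280kB" in the log.
console.log(Math.round(bytes / 1024));
```

If the computed size matches the last progress line, the conversion covered the full input, which is consistent with `exit(0)` being a normal exit rather than a failure.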

mjtechguy commented 9 months ago

@n4ze3m you are correct, it actually does appear to complete eventually. This appears to be a CPU-intensive task? I am only seeing a single thread being used for processing on my 16-core system.

n4ze3m commented 9 months ago

🤔 Yes, running the Whisper model locally takes a fair amount of CPU. I need to check why it only uses a single thread for processing. I'm also updating the current queue logic, which should make the application a little faster than it is now.
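For readers following along, here is a rough sketch of what a concurrency-limited queue can look like. This is an illustration only, not the dialoqbase queue implementation; `runWithConcurrency` and the `Job` type are hypothetical names.

```typescript
// Hypothetical sketch; not the dialoqbase queue implementation.
type Job<T> = () => Promise<T>;

// Run jobs with at most `limit` in flight at once; results keep the input order.
async function runWithConcurrency<T>(jobs: Job<T>[], limit: number): Promise<T[]> {
  const results: T[] = new Array(jobs.length);
  let next = 0; // index of the next job that has not been started yet

  async function worker(): Promise<void> {
    while (next < jobs.length) {
      // Claiming an index is safe because JavaScript runs this
      // synchronously between awaits.
      const index = next++;
      results[index] = await jobs[index]();
    }
  }

  // Start up to `limit` workers that pull jobs until none are left.
  const workers = Array.from({ length: Math.min(limit, jobs.length) }, () => worker());
  await Promise.all(workers);
  return results;
}
```

Note that a queue like this only overlaps asynchronous waits such as downloads or database calls. CPU-bound work such as local model inference would still run on one thread unless it is moved to worker threads or separate processes.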

n4ze3m commented 9 months ago

Hey, multi-user support has been added, and concurrency queue processing has been implemented. You can use these links for more details:

mjtechguy commented 9 months ago

@n4ze3m these new updates are working great in 1.0.1. I have tested long YouTube videos, the web crawler, websites, PDFs, and more. Thanks for this!

mjtechguy commented 9 months ago

A few questions about some of the loaders. Should these be separate "issues"?

Thanks!

n4ze3m commented 9 months ago

Thanks for the suggestions. I will look into it :)

n4ze3m commented 8 months ago

Hey, a new version has been released which supports RAG. You can enable it on the bot's settings page.

mjtechguy commented 8 months ago

Awesome, sir! I will test it this evening.

mjtechguy commented 8 months ago

This appears to be working well. I will keep testing. Great work sir!

n4ze3m commented 8 months ago

Hey, the latest version has been released, which supports custom or local AI models that are compatible with the OpenAI API. For more details, check out this link: https://dialoqbase.n4ze3m.com/guide/localai-model.html.

mjtechguy commented 8 months ago

Excellent! Will try this weekend

n4ze3m commented 8 months ago

Hey, a new version has been released, which now allows you to add an API key to custom models. For example, you can use OpenRouter models.
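To illustrate what "compatible with the OpenAI API" plus an optional API key means in practice, here is a rough sketch of building a chat completion request. `buildChatRequest` is a hypothetical helper and not dialoqbase code, and the base URL, model name, and key below are placeholders to be checked against your provider's documentation.

```typescript
// Hypothetical helper; names and shapes are illustrative, not dialoqbase's code.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build (but do not send) a chat completion request for an OpenAI-compatible server.
function buildChatRequest(
  baseUrl: string,
  model: string,
  messages: ChatMessage[],
  apiKey?: string
): ChatRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // Servers without authentication need no key; hosted providers usually expect one.
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`;
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    headers,
    body: JSON.stringify({ model, messages }),
  };
}

// Placeholder values: check the base URL and model name with your provider.
const req = buildChatRequest(
  "https://openrouter.ai/api/v1",
  "example/model-name",
  [{ role: "user", content: "Hello" }],
  "sk-placeholder"
);
console.log(req.url);
```

The returned object can be passed to `fetch(req.url, { method: "POST", headers: req.headers, body: req.body })`. For a local server the same function would be called with a local base URL and no key.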

n4ze3m commented 8 months ago

Hey, a new version has been released with improved default RAG support, along with support for new OpenAI models, and faster video and audio processing using DistilWhisper