-
### Feature request
This feature request aims to improve the speed of Whisper's batched version by adding a VAD model (such as pyannote, NeMo, or Silero) and merging chunks up to 30 sec, instea…
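
A minimal sketch of the merging step, assuming the VAD model has already produced a list of `(start, end)` speech segments in seconds. The function name `merge_segments` and the greedy strategy are illustrative assumptions, not the API of pyannote, NeMo, Silero, or Whisper.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_sec, end_sec), assumed VAD output format


def merge_segments(segments: List[Segment], max_len: float = 30.0) -> List[Segment]:
    """Greedily merge consecutive speech segments into chunks of at most max_len seconds.

    A single segment longer than max_len is kept as its own chunk (it is not split here).
    """
    merged: List[Segment] = []
    for start, end in sorted(segments):
        # Extend the current chunk only if the merged span still fits in max_len.
        if merged and end - merged[-1][0] <= max_len:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    vad_output = [(0.0, 5.0), (6.0, 12.0), (14.0, 29.0), (31.0, 40.0), (41.0, 50.0)]
    print(merge_segments(vad_output))
```

Fewer, fuller chunks mean fewer padded 30-second windows per batch, which is where the speed-up would be expected to come from.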
-
Depending on the file type (filename extension), we could use different chunkers.
For example, for uncompressed tar files, we could split at file boundaries to better deduplicate different tars that share a …
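
As a rough sketch of the tar idea, Python's standard `tarfile` module exposes each member's header offset and data offset, which is enough to compute per-member byte ranges. The helper name `tar_chunk_boundaries` is hypothetical, and how a chunker would consume these ranges is left open.

```python
import io
import tarfile
from typing import BinaryIO, List, Tuple


def tar_chunk_boundaries(fileobj: BinaryIO) -> List[Tuple[int, int]]:
    """Return one (start, end) byte range per tar member: header plus padded data."""
    boundaries: List[Tuple[int, int]] = []
    # Mode "r:" accepts uncompressed tar archives only.
    with tarfile.open(fileobj=fileobj, mode="r:") as tar:
        for member in tar:
            # Member data is padded to a multiple of the 512-byte tar block size.
            padded_size = (member.size + 511) // 512 * 512
            boundaries.append((member.offset, member.offset_data + padded_size))
    return boundaries


if __name__ == "__main__":
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, size in (("a.txt", 10), ("b.txt", 600)):
            info = tarfile.TarInfo(name)
            info.size = size
            tar.addfile(info, io.BytesIO(b"x" * size))
    buf.seek(0)
    print(tar_chunk_boundaries(buf))
```

Cutting chunks at these ranges would keep an identical file in the same chunk regardless of what precedes it in the archive, which is what makes deduplication across tars plausible.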
-
### What happened?
When creating a new Vectorizer, the function call fails because it tries to use a dropped column.
The column being pulled in that shouldn't be is `pg.dropped.34`. For example: `sele…
-
See the Figma design [here](https://www.figma.com/design/QOxHwppG64GtPocpDhS6Y7/Product?node-id=8230-465080&node-type=canvas&t=b7Kveow09GDxp5Nv-0)
We've significantly improved our file coverage in the in…
-
### Is there an existing issue for the same feature request?
- [X] I have checked the existing issues.
### Is your feature request related to a problem?
```Markdown
The context accuracy of chunk di…
-
Should we save the options passed to the converter? I guess some of them are intrinsically stored in the zarr (all the chunking options and stuff), but maybe the full command should also be stored non…
-
### Package
filament/filament
### Package Version
v3.2.116
### Laravel Version
v11.29.0
### Livewire Version
v3.5.12
### PHP Version
PHP 8.2.22
### Problem description
…
-
# Description
Strategize a suitable chunking technique to index the given environment clearance data, where each file contains a list of projects and their details. Additionally, implement a retrie…
-
Hello,
I'm looking for an optimal chunking strategy to get relevant answers for my queries.
I tried these parameters provided in the "high quality config":
- chunk size : 7000
- chunk overlap : 2…
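
To make the interaction between the two parameters concrete, here is a minimal character-based sliding-window sketch. It assumes sizes are measured in characters (the tool in question may count tokens instead), and `chunk_text` is a hypothetical helper rather than any library's API; the values in the usage example are arbitrary.

```python
from typing import List


def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into windows of chunk_size characters, each sharing
    chunk_overlap characters with the previous window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    # Stop once the remaining tail is already covered by the previous window's overlap.
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


if __name__ == "__main__":
    # Arbitrary illustrative values, not the ones from the config above.
    print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2))
```

A larger overlap lowers the chance that an answer is split across two chunks, at the cost of more chunks and more duplicated text in the index.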
-
Our code passes the server context size from the CLI to the document chunking method, but the method then just ignores it. We should presumably do something with it!
I noticed this when reviewi…