-
To implement a common black box, we need text loading, extraction of the words to be attacked, perturbations, distance metrics, and models.
Text Loading needs to be very uniform and universal, it should enc…
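The components listed above could fit together roughly as follows. This is only a minimal sketch, not a proposed design: every name in it (`load_text`, `pick_target_words`, `swap_middle_chars`, `edit_distance`, `toy_model`, `attack`) is hypothetical, the keyword "model" is a stand-in for a real classifier, and the character swap is just one of many possible perturbations.

```python
from typing import Callable, List, Optional


def load_text(raw: str) -> List[str]:
    """Text loading: collapse whitespace and split into word tokens."""
    return raw.split()


def pick_target_words(words: List[str], min_len: int = 4) -> List[int]:
    """Word extraction: indices of words long enough to perturb (assumed heuristic)."""
    return [i for i, w in enumerate(words) if len(w) >= min_len]


def swap_middle_chars(word: str) -> str:
    """Perturbation: swap the two characters around the middle (needs len >= 2)."""
    mid = len(word) // 2
    return word[:mid - 1] + word[mid] + word[mid - 1] + word[mid + 1:]


def edit_distance(a: str, b: str) -> int:
    """Distance metric: Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def toy_model(text: str) -> int:
    """Stand-in black-box model: returns 1 if the word 'great' appears, else 0."""
    return int("great" in text.split())


def attack(raw: str, model: Callable[[str], int], max_distance: int = 2) -> Optional[str]:
    """Try one perturbation per target word; return the first text that flips the label."""
    words = load_text(raw)
    original_text = " ".join(words)
    original_label = model(original_text)
    for i in pick_target_words(words):
        candidate = words.copy()
        candidate[i] = swap_middle_chars(words[i])
        text = " ".join(candidate)
        close_enough = edit_distance(original_text, text) <= max_distance
        if close_enough and model(text) != original_label:
            return text
    return None  # no perturbation within the distance budget changed the label


print(attack("the movie was great", toy_model))
```

The attack only ever calls `model(text)`, which is what keeps it black box: any callable that maps a string to a label can be dropped in.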
-
The `helpers.py` module cannot be imported when it sits in the "text-analysis/code" directory unless either...
1) the code directory is added to sys path e.g. `sys.path.insert(0, '/content/drive/My Drive/Colab…
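Option 1 can be sketched as follows. Because the real Colab path is cut off above, the sketch builds a throwaway directory and writes a stand-in `helpers.py` into it; the temporary directory and the `greet` function are hypothetical and exist only so the example runs anywhere.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the real code directory (the actual path is not known here).
code_dir = Path(tempfile.mkdtemp()) / "code"
code_dir.mkdir()
(code_dir / "helpers.py").write_text("def greet():\n    return 'hello from helpers'\n")

# Put the directory at the front of the module search path, once.
if str(code_dir) not in sys.path:
    sys.path.insert(0, str(code_dir))

# Make sure the import system notices the newly written file.
importlib.invalidate_caches()
helpers = importlib.import_module("helpers")
print(helpers.greet())
```

Inserting at index 0 means this directory wins over any other module named `helpers` later on the path, which is usually what you want in a notebook but can shadow an installed package of the same name.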
-
As an admin, I want to be able to parse and preprocess raw text data so I can feed it into a machine learning model for sentiment analysis.
-
I am working on a project that involves providing LightRAG with hundreds of PDFs for queries. I want to ensure that the data is processed efficiently and accurately.
1. What is the optimal format fo…
-
Hi
Thanks for your awesome work!
I tried the Arabic TTS voice (Kareem), and I noticed that an important text preprocessing step is missing.
Arabic text is usually unvocalized (aka diacritized…
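To make the distinction concrete, a check for whether a string already carries vowel marks might look like the sketch below. The function name `has_diacritics` is hypothetical, it only looks for the marks in the Unicode range U+064B to U+0652 (tanwin, short vowels, shadda, sukun), and it does not perform diacritization itself, which normally needs a dedicated model or library.

```python
def has_diacritics(text: str) -> bool:
    """Return True if the text contains any Arabic mark in U+064B..U+0652."""
    return any("\u064b" <= ch <= "\u0652" for ch in text)


# The same three consonants (kaf, ta, ba) without and with fatha marks.
plain = "\u0643\u062a\u0628"
vocalized = "\u0643\u064e\u062a\u064e\u0628\u064e"

print(has_diacritics(plain))
print(has_diacritics(vocalized))
```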
-
Currently there are three separate files for preprocessing the documentation. Each file has slightly different functionality and requires specific instructions to ensure the preprocessing is done corr…
-
This issue concerns using another iOS app's share functionality to share to my React Native app, which runs the Expo share extension. The specific app in question is Home Depot. How to reproduce:
1. …
-
### GitHub Issue: Extend Vector Database with Public and Indico Pages for EIC Information
**Issue Title:**
Extend the Vector Database to Include Information from Public and Indico Pages for EIC
…
-
- coming out of #2728
If there is a part definition that is based on a record type that is unknown in the project's dsd, the part should be skippable with the option `drop_on_unknown_record_type: …
-
## Dataset Format
The pre-processing script expects data to be a directory with:
* `metadata.csv` - CSV file with text, audio filenames, and speaker names
* `wav/` - directory with audio files
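A quick sanity check of this layout before running the script might look like the sketch below. It is not part of the pre-processing script: `check_dataset_dir` is a hypothetical helper, and it assumes that `metadata.csv` has no header row and that the audio filename sits in the second column, which the description above does not specify. Adjust `audio_column` to match the real file.

```python
import csv
from pathlib import Path
from typing import List


def check_dataset_dir(root: str, audio_column: int = 1) -> List[str]:
    """Return a list of problems found in the dataset directory (empty if none).

    Assumption: column `audio_column` of metadata.csv holds a filename
    relative to wav/, and the CSV has no header row.
    """
    root_path = Path(root)
    metadata = root_path / "metadata.csv"
    wav_dir = root_path / "wav"

    problems: List[str] = []
    if not metadata.is_file():
        problems.append("missing metadata.csv")
    if not wav_dir.is_dir():
        problems.append("missing wav/ directory")
    if problems:
        return problems  # nothing more to check without both pieces

    with metadata.open(newline="", encoding="utf-8") as fh:
        for line_no, row in enumerate(csv.reader(fh), start=1):
            if len(row) <= audio_column:
                problems.append(f"line {line_no}: too few columns")
            elif not (wav_dir / row[audio_column]).is_file():
                problems.append(f"line {line_no}: audio file not found: {row[audio_column]}")
    return problems
```

Calling `check_dataset_dir("path/to/data")` returns an empty list when every row points at an existing file under `wav/`, and one message per problem otherwise.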
The …