-
**Motivation**
Initially, [docusaurus.io](https://docusaurus.io/) was chosen as the tool to render Texture's documentation on the website.
Docusaurus is great for two reasons: it's very beau…
-
(Edit)
As is the case with Pandas, every `texthero` function should handle missing (NA, "not available") values.
The rule of thumb should be the following:
> If the given input Pandas Series `s` ha…
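
To make the discussion concrete, here is a minimal sketch of one possible convention. It assumes that NA values should simply pass through unchanged, which is a guess, since the rule above is cut off. `lowercase_keep_na` is a hypothetical helper used only for illustration, not an existing `texthero` function.

```python
import numpy as np
import pandas as pd


def lowercase_keep_na(s: pd.Series) -> pd.Series:
    """Lowercase every string in `s`, leaving missing values missing.

    Hypothetical helper: it relies on the fact that pandas `.str`
    methods propagate NA instead of raising on it.
    """
    return s.str.lower()


s = pd.Series(["Hello World", np.nan, None, "TEXT"])
out = lowercase_keep_na(s)

# The NA positions of the output match the NA positions of the input.
na_positions_preserved = out.isna().equals(s.isna())
```

With this convention, callers can decide for themselves whether to drop or fill NA values before or after preprocessing.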
-
we need to create a dedicated script. This script will be responsible for importing data from the lucknow_llm package, performing necessary transformations, and storing the processed data in the datab…
-
The following is a high-level view of the next main enhancement steps. This document will be kept up to date and improved frequently. This work will be mainly conducted by @mk2510 a…
-
Hi, I was trying to run dbscan on some texts and create a scatterplot.
I wonder why my `dbscan_labels` has a -1 category (not sure what it means):
```
documents['dbscan_labels'] = (
docume…
```
-
**Goal**
Implement topic modeling in Texthero.
**Topic modeling**
There are two main approaches to topic modeling: LSA/LSI (latent semantic analysis/indexing) and LDA (latent Dirichlet allocation). This [s…
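
For reference, both approaches can be sketched with scikit-learn. This is only an illustration of the two techniques on a toy corpus, not a proposal for the Texthero API; the documents, the number of topics, and the variable names are all assumptions.

```python
from sklearn.decomposition import LatentDirichletAllocation, TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Toy corpus, made up for illustration only.
docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stocks fell on the market today",
    "investors sold shares and stocks",
]
n_topics = 2  # assumed value

# LSA/LSI: truncated SVD applied to a TF-IDF document-term matrix.
tfidf = TfidfVectorizer().fit_transform(docs)
lsa = TruncatedSVD(n_components=n_topics, random_state=0)
lsa_topics = lsa.fit_transform(tfidf)  # shape: (n_docs, n_topics)

# LDA: a probabilistic model fitted on raw term counts.
counts = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
lda_topics = lda.fit_transform(counts)  # each row is a topic distribution
```

LSA returns unbounded real-valued coordinates, while LDA returns one probability distribution over topics per document, which may matter when deciding what the resulting Series should contain.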
-
Task: write the "Getting started: preprocessing" doc page
## Advice/Tips to the technical writer
Good to know:
- This page appears after the "1. Getting started". The users reading this p…
-
Text processing
-
```python
>>> import texthero as hero
>>> import pandas as pd
>>> import string
>>>
>>> s = pd.Series(rf"{string.punctuation}")
>>> hero.remove_punctuation(s)
0 \
dtype: object
```
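
For comparison, here is a plain-pandas sketch that strips every character in `string.punctuation`, backslash included. It is not Texthero's implementation; `strip_punctuation` and `PUNCT_PATTERN` are hypothetical names. The key assumption is that the character class is built with `re.escape`, so that `\` and `]` are matched literally instead of being read as regex syntax.

```python
import re
import string

import pandas as pd

# Escape each punctuation character before placing it in a character class;
# otherwise "\]" would be read as an escaped bracket and "\" would never match.
PUNCT_PATTERN = rf"[{re.escape(string.punctuation)}]+"


def strip_punctuation(s: pd.Series, symbol: str = " ") -> pd.Series:
    """Replace each run of punctuation characters with `symbol`."""
    return s.str.replace(PUNCT_PATTERN, symbol, regex=True)


s = pd.Series([string.punctuation, r"back\slash, and [brackets]!"])
result = strip_punctuation(s)

# No punctuation character survives in any row.
leftover = [ch for text in result for ch in text if ch in string.punctuation]
```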
-
It seems texthero supports spaCy 2.3.7 only, not the latest version of spaCy 3. Any plans to update the package to make it compatible with spaCy 3?