doc2bow error when running lda optimizer described in your docs

OCTIS version: 1.13.1
Python version: 3.10.14
Operating System: macOS 14.1.1
Description

I'm trying to test the optimizer described in your docs and am following the steps exactly (except for replacing dataset.load with dataset.fetch_dataset), but I get the following error: TypeError: doc2bow expects an array of unicode tokens on input, not a single string
What I Did

from skopt.space.space import Real
from octis.dataset.dataset import Dataset  # needed for Dataset() below
from octis.evaluation_metrics.coherence_metrics import Coherence
from octis.models.LDA import LDA
from octis.optimization.optimizer import Optimizer

optimizer = Optimizer()

model = LDA()
model.hyperparameters.update({"num_topics": 20})

dataset = Dataset()
dataset.fetch_dataset("M10")

metric_parameters = {
    'texts': dataset.get_corpus(),
    'topk': 10,
    'measure': 'c_npmi'
}
npmi = Coherence(metric_parameters)

search_space = {
    "alpha": Real(low=0.001, high=5.0),
    "eta": Real(low=0.001, high=5.0)
}

optimization_result = optimizer.optimize(model,
                                         dataset,
                                         npmi,
                                         search_space,
                                         number_of_call=10,
                                         n_random_starts=3,
                                         model_runs=3,
                                         save_name="result",
                                         surrogate_model="RF",
                                         acq_func="LCB"
                                         )

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[11], line 16
      9 model.hyperparameters.update({"num_topics": 20})
     11 metric_parameters = {
     12     'texts': dataset.get_corpus(),
     13     'topk': 10,
     14     'measure': 'c_npmi'
     15 }
---> 16 npmi = Coherence(metric_parameters)
     18 search_space = {
     19     "alpha": Real(low=0.001, high=5.0),
     20     "eta": Real(low=0.001, high=5.0)
     21 }
     23 optimization_result = optimizer.optimize(model,
     24                                          dataset,
     25                                          npmi,
   (...)
     32                                          acq_func="LCB"
     33                                          )

File /opt/homebrew/Caskroom/miniconda/base/envs/octis/lib/python3.10/site-packages/octis/evaluation_metrics/coherence_metrics.py:34, in Coherence.__init__(self, texts, topk, processes, measure)
     32 else:
     33     self._texts = texts
---> 34 self._dictionary = Dictionary(self._texts)
     35 self.topk = topk
     36 self.processes = processes

File /opt/homebrew/Caskroom/miniconda/base/envs/octis/lib/python3.10/site-packages/gensim/corpora/dictionary.py:78, in Dictionary.__init__(self, documents, prune_at)
     75 self.num_nnz = 0
     77 if documents is not None:
---> 78     self.add_documents(documents, prune_at=prune_at)
     79     self.add_lifecycle_event(
     80         "created",
     81         msg=f"built {self} from {self.num_docs} documents (total {self.num_pos} corpus positions)",
     82     )

File /opt/homebrew/Caskroom/miniconda/base/envs/octis/lib/python3.10/site-packages/gensim/corpora/dictionary.py:204, in Dictionary.add_documents(self, documents, prune_at)
    201         logger.info("adding document #%i to %s", docno, self)
    203     # update Dictionary with the document
--> 204     self.doc2bow(document, allow_update=True)  # ignore the result, here we only care about updating token ids
    206 logger.info("built %s from %i documents (total %i corpus positions)", self, self.num_docs, self.num_pos)

File /opt/homebrew/Caskroom/miniconda/base/envs/octis/lib/python3.10/site-packages/gensim/corpora/dictionary.py:241, in Dictionary.doc2bow(self, document, allow_update, return_missing)
    209 """Convert `document` into the bag-of-words (BoW) format = list of `(token_id, token_count)` tuples.
    210
    211 Parameters
   (...)
    238
    239 """
    240 if isinstance(document, str):
--> 241     raise TypeError("doc2bow expects an array of unicode tokens on input, not a single string")
    243 # Construct (word, frequency) mapping.
    244 counter = defaultdict(int)

TypeError: doc2bow expects an array of unicode tokens on input, not a single string
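
A possible reading of the traceback, offered as a guess rather than a confirmed diagnosis: the installed signature is Coherence.__init__(self, texts, topk, processes, measure), so passing the metric_parameters dict positionally would bind the whole dict to texts. Iterating a dict yields its keys, so the strings 'texts', 'topk' and 'measure' would be treated as documents, which is exactly what the isinstance(document, str) check rejects. The sketch below reproduces that mechanism without OCTIS or gensim. FakeCoherence, doc2bow_check, the default argument values and the toy corpus are all hypothetical stand-ins, not library code.

```python
def doc2bow_check(document):
    """Stand-in for the string check shown in the traceback (Dictionary.doc2bow)."""
    if isinstance(document, str):
        raise TypeError(
            "doc2bow expects an array of unicode tokens on input, not a single string"
        )
    return list(document)


class FakeCoherence:
    """Hypothetical stand-in with the parameter names from the traceback.

    The default values below are assumptions made for this sketch.
    """

    def __init__(self, texts=None, topk=10, processes=1, measure="c_npmi"):
        self.topk = topk
        self.processes = processes
        self.measure = measure
        # Iterating a dict yields its keys, so a dict passed as `texts`
        # hands plain strings to the check above.
        self.documents = [doc2bow_check(doc) for doc in texts]


# Toy tokenized corpus: a list of documents, each a list of tokens.
corpus = [["topic", "model"], ["latent", "dirichlet", "allocation"]]
metric_parameters = {"texts": corpus, "topk": 10, "measure": "c_npmi"}

# Positional call: the whole dict is bound to `texts`.
try:
    FakeCoherence(metric_parameters)
    positional_error = None
except TypeError as exc:
    positional_error = str(exc)

# Keyword unpacking: each dict entry is bound to its own parameter.
metric = FakeCoherence(**metric_parameters)

print(positional_error)
print(metric.documents)
```

If this reading is right, the analogous change in the snippet above would be to pass the values as keyword arguments, for example Coherence(texts=dataset.get_corpus(), topk=10, measure='c_npmi'). This has not been verified against OCTIS 1.13.1.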
MIND-Lab / OCTIS

doc2bow error when running lda optimizer described in your docs #121
