-
- [x] Create `preprocess.py` script to separate data preparation from training
-
This is more of a question than an issue, but it seems like something that could be useful to include in the README.md or future docs.
I'm wondering if there are any recommendations for how to prepa…
-
Preprocessing steps to do for each subject
-
Preprocessing steps to do once subjects have been concatenated
-
This part involves
1. researching and extracting stop words, for example
```["document", "story", "machine translation", "translation", "figure"] -> ["machine translation"]```, performing NLP analy…
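A minimal sketch of the stop-word filtering step, assuming the stop words are kept in a plain set and matched case-insensitively against whole keywords. The function name `filter_keywords` and the contents of `STOP_WORDS` are hypothetical; the list was chosen only so that the result matches the example above.

```python
def filter_keywords(keywords, stop_words):
    """Drop every keyword that appears in the stop-word set (case-insensitive)."""
    stop = {word.lower() for word in stop_words}
    return [kw for kw in keywords if kw.lower() not in stop]


# Hypothetical stop-word list; in practice this would come out of the research step.
STOP_WORDS = {"document", "story", "translation", "figure"}

keywords = ["document", "story", "machine translation", "translation", "figure"]
filtered = filter_keywords(keywords, STOP_WORDS)
print(filtered)
```

Matching whole keywords rather than individual tokens is what lets "machine translation" survive even though "translation" is on the stop list.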
-
As pointed out in the review by @flxst in the PR https://github.com/Modalities/modalities/pull/164, the class inheritance of the dataset classes can be improved:
> 1. I think that the class inheri…
-
Thanks for sharing your code.
I am wondering whether there are any preprocessing steps involved in your method.
Could you share the code about how to generate the ATLAS.h5 from raw data (229 cases)?
…
-
catboost version: ai.catboost:catboost-spark_3.0_2.12:1.1
data: (7m, 450) with **90** category features
unique values per cat feature: from 10 to 5k
sparkconf:
driver 64g 32cores
executor 64g …
-
The current workflow of `eval_fewshot.py` is:
1. Generate "source", which contains the example QAs and the Question we ask.
2. Concatenate "source" and "target", where target is the option we want…
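The first two steps could be sketched roughly as follows. This is not the actual code of `eval_fewshot.py`: the helper names `build_source` and `build_inputs`, the `Q:`/`A:` prompt layout, and the single-space separator are all assumptions made for illustration.

```python
def build_source(examples, question):
    """Step 1: join the example QAs and the question being asked into one prompt."""
    parts = [f"Q: {q}\nA: {a}" for q, a in examples]
    parts.append(f"Q: {question}\nA:")  # the answer slot is left open
    return "\n\n".join(parts)


def build_inputs(source, targets):
    """Step 2: concatenate the source with each candidate target."""
    return [f"{source} {target}" for target in targets]


# Hypothetical few-shot examples and answer options.
examples = [("2 + 2 = ?", "4"), ("3 + 5 = ?", "8")]
source = build_source(examples, "1 + 6 = ?")
inputs = build_inputs(source, ["6", "7"])
```

Keeping the source separate from the targets makes it easy to tell which part of each concatenated input belongs to the prompt and which part is the option.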
-
Hi,
I'm trying to load a custom dataset without removing the punctuation. However, if I set `remove_punctuation = False`, all punctuation is still removed and, even worse, words connected to any punctua…