-
Hello,
I can't find where the cue generation is done. Which file handles it? Also, I'm not able to access Google Drive for the dataset. It says that the URL does not exist…
-
Hi!
I've noticed that the code for generating hard preference keywords and target keywords is not included in this repo. I tried to generate them using the prompts in the file `prompts.py` on t…
-
Can you share your dataset generation script for symbolic SQL data? I found some invalid SQL and wanted to improve it.
There are spaces in table column names, which is invalid, as shown in the exam…
-
Can you give me instructions on how to train the network on a custom dataset? It contains 512x512 RGB images labeled with the time each was taken, expressed as differences in minutes. I…
-
- Write a shell script to get UA Wikipedia data
- Update `.gitignore` so bulk data is not uploaded to the repo
- Write a module that takes a `model` parameter and raw text data to generate a prompt-based dataset f…
-
Hello! Thank you for your amazing work and open-source contributions. Generating the large training dataset is taking me a considerable amount of time (without parallel processing) and storage space. …
-
# Alex Strick van Linschoten - How to think about creating a dataset for LLM finetuning evaluation
I summarise the kinds of evaluations that are needed for a structured data generation task.
[https:…
-
Dr. Bai,
Good evening.
I've debugged the MMGR code using the released mmgrdataset. Now I want to train and test on a dataset from another city.
According to your reference, I visited the required …
-
So there are a few options for generating propellant datasets:
A) We could scrape CEAWeb; but for each Mixture Ratio+exit pressure combination we'd need 60 plus data points (chamber pressure). So a…
-
The current README has instructions for reproducing the evaluation results and running inference.
I cannot find instructions to reproduce the dataset generation and fine-tuning steps. Please add…