bdiu29 closed 11 months ago
Okay, so it looks like I need to create unique titles for each chunk.
@bdiu29 that's right, both passage and title
@bdiu29 here is a toy example https://github.com/arcee-ai/DALM/blob/e6c3d293e7c75a43a6cfb3d78681969c1219e8c8/dalm/datasets/toy_data_train.csv
Thanks! I figured it out. I used my text chunks in the 'Passage' column and passed them into Llama2 to generate unique 'Title' values for each chunk.
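For anyone landing here later, the approach above could be sketched roughly like this. It is only a sketch under assumptions: the column names 'Title' and 'Passage' are taken from this thread (check the toy CSV linked above for the exact header), `build_qa_gen_csv` and `make_title` are hypothetical names, and the default title generator is a plain counter standing in for the Llama2 step.

```python
import csv


def build_qa_gen_csv(chunks, out_path, make_title=None):
    """Write text chunks to a CSV with a unique 'Title' for each 'Passage'.

    Column names are assumed from this thread, not from the DALM docs.
    """
    if make_title is None:
        # Placeholder for the LLM step: the poster generated titles with Llama2.
        def make_title(index, text):
            return f"chunk-{index:04d}"

    seen = set()
    rows = []
    for index, text in enumerate(chunks):
        title = make_title(index, text)
        # If the generator repeats a title, add a numeric suffix to keep it unique.
        base, n = title, 1
        while title in seen:
            n += 1
            title = f"{base} ({n})"
        seen.add(title)
        rows.append({"Title": title, "Passage": text})

    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["Title", "Passage"])
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

To use generated titles, pass your own `make_title(index, text)` callable. The suffix step is there because a model can return the same title for two similar chunks.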
I think I tried running dalm qa-gen on toy_data_train.csv, and it was giving me the missing 'Title' column error as well.
I'm trying to generate triplets using dalm qa-gen on my local CSV file, which has a column called 'Passage' that contains my chunked texts. Apparently it's expecting a title column? Any chance you guys have an example of how the input data should be formatted for ingestion? Thanks!
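A quick way to check a file before running the command is to look at its header row. This is a minimal sketch, assuming the expected columns are 'Title' and 'Passage' as discussed in this thread; `missing_columns` is a hypothetical helper, not part of DALM.

```python
import csv

# Assumed from this thread; compare with the header of the toy CSV in the repo.
REQUIRED_COLUMNS = ("Title", "Passage")


def missing_columns(csv_path, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        # An empty file yields an empty header, so every column counts as missing.
        header = next(csv.reader(f), [])
    return [name for name in required if name not in header]
```

An empty list means both columns are present. The comparison is case-sensitive, so 'title' would not match 'Title'.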