a7217339 closed this issue 6 days ago
Please don't use an image when referring to code next time. Use a permalink to the code instead.
train_dataset should be in the form of dict after processing the map
No, map returns a Dataset instance (see the datasets.map documentation). Unless you remove these columns (prompt, completion) from the dataset, they remain.
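To make that concrete, here is a minimal pure-Python sketch of the row-wise merge. It is a toy stand-in, not the library's implementation: `toy_map`, `fake_tokenize`, and `fake_process_tokens` are hypothetical names made up for illustration, and the "tokenizer" just maps characters to code points. It only shows why a later map step can still read `example["prompt"]` unless the column was removed.

```python
# Toy model of a non-batched map: the dict returned by fn is merged into each row.
# toy_map, fake_tokenize and fake_process_tokens are hypothetical, illustration-only names.

def toy_map(rows, fn, remove_columns=None):
    remove_columns = set(remove_columns or [])
    out = []
    for example in rows:
        # fn still sees every input column, including ones about to be removed.
        new_columns = fn(example)
        kept = {k: v for k, v in example.items() if k not in remove_columns}
        out.append({**kept, **new_columns})
    return out

def fake_tokenize(example):
    # Stand-in tokenizer: one "token id" per character.
    return {
        "prompt_input_ids": [ord(c) for c in example["prompt"]],
        "answer_input_ids": [ord(c) for c in example["completion"]],
    }

def fake_process_tokens(example):
    # A second map step: the original text columns are still present in the row.
    prompt = example["prompt"]
    completion = example["completion"]
    return {"n_chars": len(prompt) + len(completion)}

rows = [{"prompt": "hi", "completion": "yo"}]

kept = toy_map(rows, fake_tokenize)
dropped = toy_map(rows, fake_tokenize, remove_columns=["prompt", "completion"])
second = toy_map(kept, fake_process_tokens)

print(sorted(kept[0]))     # original columns plus the new ones
print(sorted(dropped[0]))  # only the new columns
print(second[0]["n_chars"])
```

With the real library, the corresponding knob is the `remove_columns` argument of `map`. If it is not passed, the second map step receives rows that still contain `prompt` and `completion`.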
Thank you for your guidance. As a beginner, I am not yet proficient in grammar. Sorry.
I have a code comprehension problem: train_dataset should be in the form of a dict after being processed by map with _tokenize(prompt_input_ids=prompt_input_ids, prompt_attention_mask=prompt_attention_mask, answer_input_ids=answer_input_ids, answer_attention_mask=answer_attention_mask). How can I still access keys like 'prompt' in the map of process_token (prompt = example["prompt"], completion = example["completion"])?