EleutherAI / lm-evaluation-harness

A framework for few-shot evaluation of language models.
https://www.eleuther.ai
MIT License

I encountered the following problem when I tried to specify the Ceval dataset using Ceval-valid-*. I don't understand why #713

Closed Halflifefa closed 1 year ago

Halflifefa commented 1 year ago

python main.py \
    --model hf-causal-experimental \
    --model_args pretrained=../LLaMA-Efficient-Tuning/model/Baichuan-13B-Instruction \
    --tasks Ceval-valid-* \
    --device cuda:0

usage: main.py [-h] --model MODEL [--model_args MODEL_ARGS] [--tasks {Ceval-valid-accountant,Ceval-valid-advanced_mathematics,Ceval-valid-art_studies,Ceval-valid-basic_medicine,Ceval-valid-business_administration,Ceval-valid-chinese_language_and_literature,Ceval-valid-civil_servant,Ceval-valid-clinical_medicine,Ceval-valid-college_chemistry,Ceval-valid-college_economics,Ceval-valid-college_physics,Ceval-valid-college_programming,Ceval-valid-computer_architecture,Ceval-valid-computer_network,Ceval-valid-discrete_mathematics,Ceval-valid-education_science,Ceval-valid-electrical_engineer,Ceval-valid-environmental_impact_assessment_engineer,Ceval-valid-fire_engineer,Ceval-valid-high_school_biology,Ceval-valid-high_school_chemistry,Ceval-valid-high_school_chinese,Ceval-valid-high_school_geography,Ceval-valid-high_school_history,Ceval-valid-high_school_mathematics,Ceval-valid-high_school_physics,Ceval-valid-high_school_politics,Ceval-valid-ideological_and_moral_cultivation,Ceval-valid-law,Ceval-valid-legal_professional,Ceval-valid-logic,Ceval-valid-mao_zedong_thought,Ceval-valid-marxism,Ceval-valid-metrology_engineer,Ceval-valid-middle_school_biology,Ceval-valid-middle_school_chemistry,Ceval-valid-middle_school_geography,Ceval-valid-middle_school_history,Ceval-valid-middle_school_mathematics,Ceval-valid-middle_school_physics,Ceval-valid-middle_school_politics,Ceval-valid-modern_chinese_history,Ceval-valid-operating_system,Ceval-valid-physician,Ceval-valid-plant_protection,Ceval-valid-probability_and_statistics,Ceval-valid-professional_tour_guide,Ceval-valid-sports_science,Ceval-valid-tax_accountant,Ceval-valid-teacher_qualification,Ceval-valid-urban_and_rural_planner,Ceval-valid-veterinary_medicine,anagrams1,anagrams2,anli_r1,anli_r2,anli_r3,arc_challenge,arc_easy,arithmetic_1dc,arithmetic_2da,arithmetic_2dm,arithmetic_2ds,arithmetic_3da,arithmetic_3ds,arithmetic_4da,arithmetic_4ds,arithmetic_5da,arithmetic_5ds,babi,bigbench_init,bigbench_causal_judgement,bigbench_date_understanding,bigbench_disambiguation_qa,bigbench_dyck_languages,bigbench_formal_fallacies_syllogisms_negation,bigbench_geometric_shapes,bigbench_hyperbaton,bigbench_logical_deduction_five_objects,bigbench_logical_deduction_seven_objects,bigbench_logical_deduction_three_objects,bigbench_movie_recommendation,bigbench_navigate,bigbench_reasoning_about_colored_objects,bigbench_ruin_names,bigbench_salient_translation_error_detection,bigbench_snarks,bigbench_sports_understanding,bigbench_temporal_sequences,bigbench_tracking_shuffled_objects_five_objects,bigbench_tracking_shuffled_objects_seven_objects,bigbench_tracking_shuffled_objects_three_objects,blimp_adjunct_island,blimp_anaphor_gender_agreement,blimp_anaphor_number_agreement,blimp_animate_subject_passive,blimp_animate_subject_trans,blimp_causative,blimp_complex_NP_island,blimp_coordinate_structure_constraint_complex_left_branch,blimp_coordinate_structure_constraint_object_extraction,blimp_determiner_noun_agreement_1,blimp_determiner_noun_agreement_2,blimp_determiner_noun_agreement_irregular_1,blimp_determiner_noun_agreement_irregular_2,blimp_determiner_noun_agreement_with_adj_2,blimp_determiner_noun_agreement_with_adj_irregular_1,blimp_determiner_noun_agreement_with_adj_irregular_2,blimp_determiner_noun_agreement_with_adjective_1,blimp_distractor_agreement_relational_noun,blimp_distractor_agreement_relative_clause,blimp_drop_argument,blimp_ellipsis_n_bar_1,blimp_ellipsis_n_bar_2,blimp_existential_there_object_raising,blimp_existential_there_quantifiers_1,blimp_existential_there_qua
ntifiers_2,blimp_existential_there_subject_raising,blimp_expletive_it_object_raising,blimp_inchoative,blimp_intransitive,blimp_irregular_past_participle_adjectives,blimp_irregular_past_participle_verbs,blimp_irregular_plural_subject_verb_agreement_1,blimp_irregular_plural_subject_verb_agreement_2,blimp_left_branch_island_echo_question,blimp_left_branch_island_simple_question,blimp_matrix_question_npi_licensor_present,blimp_npi_present_1,blimp_npi_present_2,blimp_only_npi_licensor_present,blimp_only_npi_scope,blimp_passive_1,blimp_passive_2,blimp_principle_A_c_command,blimp_principle_A_case_1,blimp_principle_A_case_2,blimp_principle_A_domain_1,blimp_principle_A_domain_2,blimp_principle_A_domain_3,blimp_principle_A_reconstruction,blimp_regular_plural_subject_verb_agreement_1,blimp_regular_plural_subject_verb_agreement_2,blimp_sentential_negation_npi_licensor_present,blimp_sentential_negation_npi_scope,blimp_sentential_subject_island,blimp_superlative_quantifiers_1,blimp_superlative_quantifiers_2,blimp_tough_vs_raising_1,blimp_tough_vs_raising_2,blimp_transitive,blimp_wh_island,blimp_wh_questions_object_gap,blimp_wh_questions_subject_gap,blimp_wh_questions_subject_gap_long_distance,blimp_wh_vs_that_no_gap,blimp_wh_vs_that_no_gap_long_distance,blimp_wh_vs_that_with_gap,blimp_wh_vs_that_with_gap_long_distance,boolq,cb,cola,copa,coqa,crows_pairs_english,crows_pairs_english_age,crows_pairs_english_autre,crows_pairs_english_disability,crows_pairs_english_gender,crows_pairs_english_nationality,crows_pairs_english_physical_appearance,crows_pairs_english_race_color,crows_pairs_english_religion,crows_pairs_english_sexual_orientation,crows_pairs_english_socioeconomic,crows_pairs_french,crows_pairs_french_age,crows_pairs_french_autre,crows_pairs_french_disability,crows_pairs_french_gender,crows_pairs_french_nationality,crows_pairs_french_physical_appearance,crows_pairs_french_race_color,crows_pairs_french_religion,crows_pairs_french_sexual_orientation,crows_pairs_french_socioeconomic,csatqa_gr,csatqa_li,csatqa_rch,csatqa_rcs,csatqa_rcss,csatqa_wr,cycle_letters,drop,ethics_cm,ethics_deontology,ethics_justice,ethics_utilitarianism,ethics_utilitarianism_original,ethics_virtue,gsm8k,headqa,headqa_en,headqa_es,hellaswag,hendrycksTest-abstract_algebra,hendrycksTest-anatomy,hendrycksTest-astronomy,hendrycksTest-business_ethics,hendrycksTest-clinical_knowledge,hendrycksTest-college_biology,hendrycksTest-college_chemistry,hendrycksTest-college_computer_science,hendrycksTest-college_mathematics,hendrycksTest-college_medicine,hendrycksTest-college_physics,hendrycksTest-computer_security,hendrycksTest-conceptual_physics,hendrycksTest-econometrics,hendrycksTest-electrical_engineering,hendrycksTest-elementary_mathematics,hendrycksTest-formal_logic,hendrycksTest-global_facts,hendrycksTest-high_school_biology,hendrycksTest-high_school_chemistry,hendrycksTest-high_school_computer_science,hendrycksTest-high_school_european_history,hendrycksTest-high_school_geography,hendrycksTest-high_school_government_and_politics,hendrycksTest-high_school_macroeconomics,hendrycksTest-high_school_mathematics,hendrycksTest-high_school_microeconomics,hendrycksTest-high_school_physics,hendrycksTest-high_school_psychology,hendrycksTest-high_school_statistics,hendrycksTest-high_school_us_history,hendrycksTest-high_school_world_history,hendrycksTest-human_aging,hendrycksTest-human_sexuality,hendrycksTest-international_law,hendrycksTest-jurisprudence,hendrycksTest-logical_fallacies,hendrycksTest-machine_learning,hendrycksTest-management,hendryc
ksTest-marketing,hendrycksTest-medical_genetics,hendrycksTest-miscellaneous,hendrycksTest-moral_disputes,hendrycksTest-moral_scenarios,hendrycksTest-nutrition,hendrycksTest-philosophy,hendrycksTest-prehistory,hendrycksTest-professional_accounting,hendrycksTest-professional_law,hendrycksTest-professional_medicine,hendrycksTest-professional_psychology,hendrycksTest-public_relations,hendrycksTest-security_studies,hendrycksTest-sociology,hendrycksTest-us_foreign_policy,hendrycksTest-virology,hendrycksTest-world_religions,iwslt17-ar-en,iwslt17-en-ar,lambada_openai,lambada_openai_cloze,lambada_openai_mt_de,lambada_openai_mt_en,lambada_openai_mt_es,lambada_openai_mt_fr,lambada_openai_mt_it,lambada_standard,lambada_standard_cloze,logiqa,math_algebra,math_asdiv,math_counting_and_prob,math_geometry,math_intermediate_algebra,math_num_theory,math_prealgebra,math_precalc,mathqa,mc_taco,mgsm_bn,mgsm_de,mgsm_en,mgsm_es,mgsm_fr,mgsm_ja,mgsm_ru,mgsm_sw,mgsm_te,mgsm_th,mgsm_zh,mnli,mnli_mismatched,mrpc,multirc,mutual,mutual_plus,openbookqa,pawsx_de,pawsx_en,pawsx_es,pawsx_fr,pawsx_ja,pawsx_ko,pawsx_zh,pile_arxiv,pile_bookcorpus2,pile_books3,pile_dm-mathematics,pile_enron,pile_europarl,pile_freelaw,pile_github,pile_gutenberg,pile_hackernews,pile_nih-exporter,pile_opensubtitles,pile_openwebtext2,pile_philpapers,pile_pile-cc,pile_pubmed-abstracts,pile_pubmed-central,pile_stackexchange,pile_ubuntu-irc,pile_uspto,pile_wikipedia,pile_youtubesubtitles,piqa,prost,pubmedqa,qa4mre_2011,qa4mre_2012,qa4mre_2013,qasper,qnli,qqp,race,random_insertion,record,reversed_words,rte,sciq,scrolls_contractnli,scrolls_govreport,scrolls_narrativeqa,scrolls_qasper,scrolls_qmsum,scrolls_quality,scrolls_summscreenfd,squad2,sst,swag,toxigen,triviaqa,truthfulqa_gen,truthfulqa_mc,webqs,wic,wikitext,winogrande,wmt14-en-fr,wmt14-fr-en,wmt16-de-en,wmt16-en-de,wmt16-en-ro,wmt16-ro-en,wmt20-cs-en,wmt20-de-en,wmt20-de-fr,wmt20-en-cs,wmt20-en-de,wmt20-en-iu,wmt20-en-ja,wmt20-en-km,wmt20-en-pl,wmt20-en-ps,wmt20-en-ru,wmt20-en-ta,wmt20-en-zh,wmt20-fr-de,wmt20-iu-en,wmt20-ja-en,wmt20-km-en,wmt20-pl-en,wmt20-ps-en,wmt20-ru-en,wmt20-ta-en,wmt20-zh-en,wnli,wsc,wsc273,xcopa_et,xcopa_ht,xcopa_id,xcopa_it,xcopa_qu,xcopa_sw,xcopa_ta,xcopa_th,xcopa_tr,xcopa_vi,xcopa_zh,xnli_ar,xnli_bg,xnli_de,xnli_el,xnli_en,xnli_es,xnli_fr,xnli_hi,xnli_ru,xnli_sw,xnli_th,xnli_tr,xnli_ur,xnli_vi,xnli_zh,xstory_cloze_ar,xstory_cloze_en,xstory_cloze_es,xstory_cloze_eu,xstory_cloze_hi,xstory_cloze_id,xstory_cloze_my,xstory_cloze_ru,xstory_cloze_sw,xstory_cloze_te,xstory_cloze_zh,xwinograd_en,xwinograd_fr,xwinograd_jp,xwinograd_pt,xwinograd_ru,xwinograd_zh}] [--provide_description] [--num_fewshot NUM_FEWSHOT] [--batch_size BATCH_SIZE] [--max_batch_size MAX_BATCH_SIZE] [--device DEVICE] [--output_path OUTPUT_PATH] [--limit LIMIT] [--data_sampling DATA_SAMPLING] [--no_cache] [--decontamination_ngrams_path DECONTAMINATION_NGRAMS_PATH] [--description_dict_path DESCRIPTION_DICT_PATH] [--check_integrity] [--write_out] [--output_base_path OUTPUT_BASE_PATH] main.py: error: argument --tasks: invalid choice: 'Ceval-valid-electrical_engineer_write_out_info.json' (choose from 'Ceval-valid-accountant', 'Ceval-valid-advanced_mathematics', 'Ceval-valid-art_studies', 'Ceval-valid-basic_medicine', 'Ceval-valid-business_administration', 'Ceval-valid-chinese_language_and_literature', 'Ceval-valid-civil_servant', 'Ceval-valid-clinical_medicine', 'Ceval-valid-college_chemistry', 'Ceval-valid-college_economics', 'Ceval-valid-college_physics', 'Ceval-valid-college_programming', 
'Ceval-valid-computer_architecture', 'Ceval-valid-computer_network', 'Ceval-valid-discrete_mathematics', 'Ceval-valid-education_science', 'Ceval-valid-electrical_engineer', 'Ceval-valid-environmental_impact_assessment_engineer', 'Ceval-valid-fire_engineer', 'Ceval-valid-high_school_biology', 'Ceval-valid-high_school_chemistry', 'Ceval-valid-high_school_chinese', 'Ceval-valid-high_school_geography', 'Ceval-valid-high_school_history', 'Ceval-valid-high_school_mathematics', 'Ceval-valid-high_school_physics', 'Ceval-valid-high_school_politics', 'Ceval-valid-ideological_and_moral_cultivation', 'Ceval-valid-law', 'Ceval-valid-legal_professional', 'Ceval-valid-logic', 'Ceval-valid-mao_zedong_thought', 'Ceval-valid-marxism', 'Ceval-valid-metrology_engineer', 'Ceval-valid-middle_school_biology', 'Ceval-valid-middle_school_chemistry', 'Ceval-valid-middle_school_geography', 'Ceval-valid-middle_school_history', 'Ceval-valid-middle_school_mathematics', 'Ceval-valid-middle_school_physics', 'Ceval-valid-middle_school_politics', 'Ceval-valid-modern_chinese_history', 'Ceval-valid-operating_system', 'Ceval-valid-physician', 'Ceval-valid-plant_protection', 'Ceval-valid-probability_and_statistics', 'Ceval-valid-professional_tour_guide', 'Ceval-valid-sports_science', 'Ceval-valid-tax_accountant', 'Ceval-valid-teacher_qualification', 'Ceval-valid-urban_and_rural_planner', 'Ceval-valid-veterinary_medicine', 'anagrams1', 'anagrams2', 'anli_r1', 'anli_r2', 'anli_r3', 'arc_challenge', 'arc_easy', 'arithmetic_1dc', 'arithmetic_2da', 'arithmetic_2dm', 'arithmetic_2ds', 'arithmetic_3da', 'arithmetic_3ds', 'arithmetic_4da', 'arithmetic_4ds', 'arithmetic_5da', 'arithmetic_5ds', 'babi', 'bigbench_init', 'bigbench_causal_judgement', 'bigbench_date_understanding', 'bigbench_disambiguation_qa', 'bigbench_dyck_languages', 'bigbench_formal_fallacies_syllogisms_negation', 'bigbench_geometric_shapes', 'bigbench_hyperbaton', 'bigbench_logical_deduction_five_objects', 'bigbench_logical_deduction_seven_objects', 'bigbench_logical_deduction_three_objects', 'bigbench_movie_recommendation', 'bigbench_navigate', 'bigbench_reasoning_about_colored_objects', 'bigbench_ruin_names', 'bigbench_salient_translation_error_detection', 'bigbench_snarks', 'bigbench_sports_understanding', 'bigbench_temporal_sequences', 'bigbench_tracking_shuffled_objects_five_objects', 'bigbench_tracking_shuffled_objects_seven_objects', 'bigbench_tracking_shuffled_objects_three_objects', 'blimp_adjunct_island', 'blimp_anaphor_gender_agreement', 'blimp_anaphor_number_agreement', 'blimp_animate_subject_passive', 'blimp_animate_subject_trans', 'blimp_causative', 'blimp_complex_NP_island', 'blimp_coordinate_structure_constraint_complex_left_branch', 'blimp_coordinate_structure_constraint_object_extraction', 'blimp_determiner_noun_agreement_1', 'blimp_determiner_noun_agreement_2', 'blimp_determiner_noun_agreement_irregular_1', 'blimp_determiner_noun_agreement_irregular_2', 'blimp_determiner_noun_agreement_with_adj_2', 'blimp_determiner_noun_agreement_with_adj_irregular_1', 'blimp_determiner_noun_agreement_with_adj_irregular_2', 'blimp_determiner_noun_agreement_with_adjective_1', 'blimp_distractor_agreement_relational_noun', 'blimp_distractor_agreement_relative_clause', 'blimp_drop_argument', 'blimp_ellipsis_n_bar_1', 'blimp_ellipsis_n_bar_2', 'blimp_existential_there_object_raising', 'blimp_existential_there_quantifiers_1', 'blimp_existential_there_quantifiers_2', 'blimp_existential_there_subject_raising', 'blimp_expletive_it_object_raising', 'blimp_inchoative', 
'blimp_intransitive', 'blimp_irregular_past_participle_adjectives', 'blimp_irregular_past_participle_verbs', 'blimp_irregular_plural_subject_verb_agreement_1', 'blimp_irregular_plural_subject_verb_agreement_2', 'blimp_left_branch_island_echo_question', 'blimp_left_branch_island_simple_question', 'blimp_matrix_question_npi_licensor_present', 'blimp_npi_present_1', 'blimp_npi_present_2', 'blimp_only_npi_licensor_present', 'blimp_only_npi_scope', 'blimp_passive_1', 'blimp_passive_2', 'blimp_principle_A_c_command', 'blimp_principle_A_case_1', 'blimp_principle_A_case_2', 'blimp_principle_A_domain_1', 'blimp_principle_A_domain_2', 'blimp_principle_A_domain_3', 'blimp_principle_A_reconstruction', 'blimp_regular_plural_subject_verb_agreement_1', 'blimp_regular_plural_subject_verb_agreement_2', 'blimp_sentential_negation_npi_licensor_present', 'blimp_sentential_negation_npi_scope', 'blimp_sentential_subject_island', 'blimp_superlative_quantifiers_1', 'blimp_superlative_quantifiers_2', 'blimp_tough_vs_raising_1', 'blimp_tough_vs_raising_2', 'blimp_transitive', 'blimp_wh_island', 'blimp_wh_questions_object_gap', 'blimp_wh_questions_subject_gap', 'blimp_wh_questions_subject_gap_long_distance', 'blimp_wh_vs_that_no_gap', 'blimp_wh_vs_that_no_gap_long_distance', 'blimp_wh_vs_that_with_gap', 'blimp_wh_vs_that_with_gap_long_distance', 'boolq', 'cb', 'cola', 'copa', 'coqa', 'crows_pairs_english', 'crows_pairs_english_age', 'crows_pairs_english_autre', 'crows_pairs_english_disability', 'crows_pairs_english_gender', 'crows_pairs_english_nationality', 'crows_pairs_english_physical_appearance', 'crows_pairs_english_race_color', 'crows_pairs_english_religion', 'crows_pairs_english_sexual_orientation', 'crows_pairs_english_socioeconomic', 'crows_pairs_french', 'crows_pairs_french_age', 'crows_pairs_french_autre', 'crows_pairs_french_disability', 'crows_pairs_french_gender', 'crows_pairs_french_nationality', 'crows_pairs_french_physical_appearance', 'crows_pairs_french_race_color', 'crows_pairs_french_religion', 'crows_pairs_french_sexual_orientation', 'crows_pairs_french_socioeconomic', 'csatqa_gr', 'csatqa_li', 'csatqa_rch', 'csatqa_rcs', 'csatqa_rcss', 'csatqa_wr', 'cycle_letters', 'drop', 'ethics_cm', 'ethics_deontology', 'ethics_justice', 'ethics_utilitarianism', 'ethics_utilitarianism_original', 'ethics_virtue', 'gsm8k', 'headqa', 'headqa_en', 'headqa_es', 'hellaswag', 'hendrycksTest-abstract_algebra', 'hendrycksTest-anatomy', 'hendrycksTest-astronomy', 'hendrycksTest-business_ethics', 'hendrycksTest-clinical_knowledge', 'hendrycksTest-college_biology', 'hendrycksTest-college_chemistry', 'hendrycksTest-college_computer_science', 'hendrycksTest-college_mathematics', 'hendrycksTest-college_medicine', 'hendrycksTest-college_physics', 'hendrycksTest-computer_security', 'hendrycksTest-conceptual_physics', 'hendrycksTest-econometrics', 'hendrycksTest-electrical_engineering', 'hendrycksTest-elementary_mathematics', 'hendrycksTest-formal_logic', 'hendrycksTest-global_facts', 'hendrycksTest-high_school_biology', 'hendrycksTest-high_school_chemistry', 'hendrycksTest-high_school_computer_science', 'hendrycksTest-high_school_european_history', 'hendrycksTest-high_school_geography', 'hendrycksTest-high_school_government_and_politics', 'hendrycksTest-high_school_macroeconomics', 'hendrycksTest-high_school_mathematics', 'hendrycksTest-high_school_microeconomics', 'hendrycksTest-high_school_physics', 'hendrycksTest-high_school_psychology', 'hendrycksTest-high_school_statistics', 'hendrycksTest-high_school_us_history', 
'hendrycksTest-high_school_world_history', 'hendrycksTest-human_aging', 'hendrycksTest-human_sexuality', 'hendrycksTest-international_law', 'hendrycksTest-jurisprudence', 'hendrycksTest-logical_fallacies', 'hendrycksTest-machine_learning', 'hendrycksTest-management', 'hendrycksTest-marketing', 'hendrycksTest-medical_genetics', 'hendrycksTest-miscellaneous', 'hendrycksTest-moral_disputes', 'hendrycksTest-moral_scenarios', 'hendrycksTest-nutrition', 'hendrycksTest-philosophy', 'hendrycksTest-prehistory', 'hendrycksTest-professional_accounting', 'hendrycksTest-professional_law', 'hendrycksTest-professional_medicine', 'hendrycksTest-professional_psychology', 'hendrycksTest-public_relations', 'hendrycksTest-security_studies', 'hendrycksTest-sociology', 'hendrycksTest-us_foreign_policy', 'hendrycksTest-virology', 'hendrycksTest-world_religions', 'iwslt17-ar-en', 'iwslt17-en-ar', 'lambada_openai', 'lambada_openai_cloze', 'lambada_openai_mt_de', 'lambada_openai_mt_en', 'lambada_openai_mt_es', 'lambada_openai_mt_fr', 'lambada_openai_mt_it', 'lambada_standard', 'lambada_standard_cloze', 'logiqa', 'math_algebra', 'math_asdiv', 'math_counting_and_prob', 'math_geometry', 'math_intermediate_algebra', 'math_num_theory', 'math_prealgebra', 'math_precalc', 'mathqa', 'mc_taco', 'mgsm_bn', 'mgsm_de', 'mgsm_en', 'mgsm_es', 'mgsm_fr', 'mgsm_ja', 'mgsm_ru', 'mgsm_sw', 'mgsm_te', 'mgsm_th', 'mgsm_zh', 'mnli', 'mnli_mismatched', 'mrpc', 'multirc', 'mutual', 'mutual_plus', 'openbookqa', 'pawsx_de', 'pawsx_en', 'pawsx_es', 'pawsx_fr', 'pawsx_ja', 'pawsx_ko', 'pawsx_zh', 'pile_arxiv', 'pile_bookcorpus2', 'pile_books3', 'pile_dm-mathematics', 'pile_enron', 'pile_europarl', 'pile_freelaw', 'pile_github', 'pile_gutenberg', 'pile_hackernews', 'pile_nih-exporter', 'pile_opensubtitles', 'pile_openwebtext2', 'pile_philpapers', 'pile_pile-cc', 'pile_pubmed-abstracts', 'pile_pubmed-central', 'pile_stackexchange', 'pile_ubuntu-irc', 'pile_uspto', 'pile_wikipedia', 'pile_youtubesubtitles', 'piqa', 'prost', 'pubmedqa', 'qa4mre_2011', 'qa4mre_2012', 'qa4mre_2013', 'qasper', 'qnli', 'qqp', 'race', 'random_insertion', 'record', 'reversed_words', 'rte', 'sciq', 'scrolls_contractnli', 'scrolls_govreport', 'scrolls_narrativeqa', 'scrolls_qasper', 'scrolls_qmsum', 'scrolls_quality', 'scrolls_summscreenfd', 'squad2', 'sst', 'swag', 'toxigen', 'triviaqa', 'truthfulqa_gen', 'truthfulqa_mc', 'webqs', 'wic', 'wikitext', 'winogrande', 'wmt14-en-fr', 'wmt14-fr-en', 'wmt16-de-en', 'wmt16-en-de', 'wmt16-en-ro', 'wmt16-ro-en', 'wmt20-cs-en', 'wmt20-de-en', 'wmt20-de-fr', 'wmt20-en-cs', 'wmt20-en-de', 'wmt20-en-iu', 'wmt20-en-ja', 'wmt20-en-km', 'wmt20-en-pl', 'wmt20-en-ps', 'wmt20-en-ru', 'wmt20-en-ta', 'wmt20-en-zh', 'wmt20-fr-de', 'wmt20-iu-en', 'wmt20-ja-en', 'wmt20-km-en', 'wmt20-pl-en', 'wmt20-ps-en', 'wmt20-ru-en', 'wmt20-ta-en', 'wmt20-zh-en', 'wnli', 'wsc', 'wsc273', 'xcopa_et', 'xcopa_ht', 'xcopa_id', 'xcopa_it', 'xcopa_qu', 'xcopa_sw', 'xcopa_ta', 'xcopa_th', 'xcopa_tr', 'xcopa_vi', 'xcopa_zh', 'xnli_ar', 'xnli_bg', 'xnli_de', 'xnli_el', 'xnli_en', 'xnli_es', 'xnli_fr', 'xnli_hi', 'xnli_ru', 'xnli_sw', 'xnli_th', 'xnli_tr', 'xnli_ur', 'xnli_vi', 'xnli_zh', 'xstory_cloze_ar', 'xstory_cloze_en', 'xstory_cloze_es', 'xstory_cloze_eu', 'xstory_cloze_hi', 'xstory_cloze_id', 'xstory_cloze_my', 'xstory_cloze_ru', 'xstory_cloze_sw', 'xstory_cloze_te', 'xstory_cloze_zh', 'xwinograd_en', 'xwinograd_fr', 'xwinograd_jp', 'xwinograd_pt', 'xwinograd_ru', 'xwinograd_zh')

haileyschoelkopf commented 1 year ago

Hi, sorry you're experiencing issues!

I'm unable to replicate this on my end. It seems like you might just have an extra unescaped newline somewhere, causing none of the arguments after it to be passed to main.py.

If this issue persists and you have a different command that reproduces it, feel free to post it here and I'll take a closer look.
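For reference, a minimal single-line variant of the original command, with the wildcard quoted so the shell passes the pattern through to main.py instead of expanding it against files in the working directory (such as the *_write_out_info.json file named in the error). This sketch assumes the installed harness version resolves wildcard task patterns itself; the model path is copied from the command above:

python main.py --model hf-causal-experimental --model_args pretrained=../LLaMA-Efficient-Tuning/model/Baichuan-13B-Instruction --tasks "Ceval-valid-*" --device cuda:0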