open-compass / opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
https://opencompass.org.cn/
Apache License 2.0
4.06k stars, 428 forks

[Feature] Vicuna Evaluation scripts #470

Closed: vateye closed this issue 1 year ago

vateye commented 1 year ago

Describe the feature

Hi, I want to know whether we have a vicuna evaluation script since I am confused about the evaluation procedure for vicuna. Thanks!

Will you implement it?

yingfhu commented 1 year ago

Please follow the steps in https://opencompass.readthedocs.io/en/latest/get_started/quick_start.html, and the vicuna model configs are under https://github.com/open-compass/opencompass/tree/main/configs/models/vicuna.

vateye commented 1 year ago

> Please follow the steps in https://opencompass.readthedocs.io/en/latest/get_started/quick_start.html, and the vicuna model configs are under https://github.com/open-compass/opencompass/tree/main/configs/models/vicuna.

Hi, thanks for your reply. However, I cannot reproduce the MMLU results with Vicuna1.5-7b.

dataset       version  metric         mode  vicuna-7b-v1.5-hf
--- Exam ---  -        -              -     -
mmlu          -        naive_average  gen   6.06
bbh           -        naive_average  gen   43.34

The BBH score seems normal, but the MMLU score does not.

Here is my evaluation script:

DATASETS="chat_small"
python run.py --models hf_vicuna_7b_v15 --datasets $DATASETS \
    --work-dir $OUTPUT_PATH \
    --num-gpus 1 \
    --summarizer small \
    --max-partition-size 40000

Are there any prompts or evaluation scripts that I can refer to?

tonysy commented 1 year ago

Could you share the prediction of vicuna on MMLU dataset?

vateye commented 1 year ago
{
    "0": {
        "origin_prompt": "There is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: Beyond the business case for engaging in CSR there are a number of moral arguments relating to: negative _______, the _______that corporations possess and the ________ of business and society.\nA. Externalities, Power, Independence\nB. Publicity, Insubstantial resources, Mutual dependence\nC. Publicity, Power, Independence\nD. Externalities, Power, Mutual dependence\nAnswer: \nD\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: _______ is the direct attempt to formally or informally manage ethical issues or problems, through specific policies, practices and programmes.\nA. Corporate social responsibility\nB. Business ethics management\nC. Sustainability\nD. Environmental management\nAnswer: \nB\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: To ensure the independence of the non-executive board members, they are a number of steps which can be taken, which include non-executives being drawn from _______ the company, being appointed for a _________ time period as well as being appointed _________.\nA. Outside, Limited, Independently\nB. Inside, Limited, Intermittently\nC. Outside, Unlimited, Intermittently\nD. Inside, Unlimited, Independently\nAnswer: \nA\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: Three contrasting tactics that CSO's can engage in to meet their aims are ________ which typically involves research and communication, ________, which may involve physically attacking a company's operations or ________, often involving some form of _______.\nA. Non-violent direct action, Violent direct action, Indirect action, Boycott\nB. Indirect action, Instrumental action, Non-violent direct action, Information campaign\nC. 
Indirect action, Violent direct action, Non-violent direct-action Boycott\nD. Non-violent direct action, Instrumental action, Indirect action, Information campaign\nAnswer: \nC\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: In contrast to _______, _______ aim to reward favourable behaviour by companies. The success of such campaigns have been heightened through the use of ___________, which allow campaigns to facilitate the company in achieving _________ .\nA. Buycotts, Boycotts, Blockchain technology, Charitable donations\nB. Buycotts, Boycotts, Digital technology, Increased Sales\nC. Boycotts, Buyalls, Blockchain technology, Charitable donations\nD. Boycotts, Buycotts, Digital technology, Increased Sales\nAnswer: \nD\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQ: _______ such as bitcoin are becoming increasingly mainstream and have a whole host of associated ethical implications, for example, they are______ and more ______. However, they have also been used to engage in _______.\nA. Cryptocurrencies, Expensive, Secure, Financial Crime\nB. Traditional currency, Cheap, Unsecure, Charitable giving\nC. Cryptocurrencies, Cheap, Secure, Financial crime\nD. Traditional currency, Expensive, Unsecure, Charitable giving\nA: ",
        "prediction": "\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: The concept of corporate social responsibility (CSR) is often associated with the idea of corporations having a duty to act in the best interests of society. This idea is often referred to as the concept of the _______ of the corporation.\nA. Shareholder primacy\nB. Stakeholder theory\nC. Managerial",
        "gold": "C"
    },
    "1": {
        "origin_prompt": "There is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: Beyond the business case for engaging in CSR there are a number of moral arguments relating to: negative _______, the _______that corporations possess and the ________ of business and society.\nA. Externalities, Power, Independence\nB. Publicity, Insubstantial resources, Mutual dependence\nC. Publicity, Power, Independence\nD. Externalities, Power, Mutual dependence\nAnswer: \nD\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: _______ is the direct attempt to formally or informally manage ethical issues or problems, through specific policies, practices and programmes.\nA. Corporate social responsibility\nB. Business ethics management\nC. Sustainability\nD. Environmental management\nAnswer: \nB\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: To ensure the independence of the non-executive board members, they are a number of steps which can be taken, which include non-executives being drawn from _______ the company, being appointed for a _________ time period as well as being appointed _________.\nA. Outside, Limited, Independently\nB. Inside, Limited, Intermittently\nC. Outside, Unlimited, Intermittently\nD. Inside, Unlimited, Independently\nAnswer: \nA\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: Three contrasting tactics that CSO's can engage in to meet their aims are ________ which typically involves research and communication, ________, which may involve physically attacking a company's operations or ________, often involving some form of _______.\nA. Non-violent direct action, Violent direct action, Indirect action, Boycott\nB. Indirect action, Instrumental action, Non-violent direct action, Information campaign\nC. 
Indirect action, Violent direct action, Non-violent direct-action Boycott\nD. Non-violent direct action, Instrumental action, Indirect action, Information campaign\nAnswer: \nC\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: In contrast to _______, _______ aim to reward favourable behaviour by companies. The success of such campaigns have been heightened through the use of ___________, which allow campaigns to facilitate the company in achieving _________ .\nA. Buycotts, Boycotts, Blockchain technology, Charitable donations\nB. Buycotts, Boycotts, Digital technology, Increased Sales\nC. Boycotts, Buyalls, Blockchain technology, Charitable donations\nD. Boycotts, Buycotts, Digital technology, Increased Sales\nAnswer: \nD\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQ: Typical advertising regulatory bodies suggest, for example that adverts must not: encourage _________, cause unnecessary ________ or _____, and must not cause _______ offence.\nA. Unsafe practices, Wants, Fear, Trivial\nB. Unsafe practices, Distress, Fear, Serious\nC. Safe practices, Wants, Jealousy, Trivial\nD. Safe practices, Distress, Jealousy, Serious\nA: ",
        "prediction": "\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: The concept of corporate social responsibility (CSR) is a relatively recent development, emerging in the 1960s and 1970s in response to growing concerns about:\nA. Environmental degradation\nB. Human rights abuses\nC. Economic inequality\nD. All of the above\n",
        "gold": "B"
    },
    "2": {
        "origin_prompt": "There is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: Beyond the business case for engaging in CSR there are a number of moral arguments relating to: negative _______, the _______that corporations possess and the ________ of business and society.\nA. Externalities, Power, Independence\nB. Publicity, Insubstantial resources, Mutual dependence\nC. Publicity, Power, Independence\nD. Externalities, Power, Mutual dependence\nAnswer: \nD\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: _______ is the direct attempt to formally or informally manage ethical issues or problems, through specific policies, practices and programmes.\nA. Corporate social responsibility\nB. Business ethics management\nC. Sustainability\nD. Environmental management\nAnswer: \nB\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: To ensure the independence of the non-executive board members, they are a number of steps which can be taken, which include non-executives being drawn from _______ the company, being appointed for a _________ time period as well as being appointed _________.\nA. Outside, Limited, Independently\nB. Inside, Limited, Intermittently\nC. Outside, Unlimited, Intermittently\nD. Inside, Unlimited, Independently\nAnswer: \nA\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: Three contrasting tactics that CSO's can engage in to meet their aims are ________ which typically involves research and communication, ________, which may involve physically attacking a company's operations or ________, often involving some form of _______.\nA. Non-violent direct action, Violent direct action, Indirect action, Boycott\nB. Indirect action, Instrumental action, Non-violent direct action, Information campaign\nC. 
Indirect action, Violent direct action, Non-violent direct-action Boycott\nD. Non-violent direct action, Instrumental action, Indirect action, Information campaign\nAnswer: \nC\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: In contrast to _______, _______ aim to reward favourable behaviour by companies. The success of such campaigns have been heightened through the use of ___________, which allow campaigns to facilitate the company in achieving _________ .\nA. Buycotts, Boycotts, Blockchain technology, Charitable donations\nB. Buycotts, Boycotts, Digital technology, Increased Sales\nC. Boycotts, Buyalls, Blockchain technology, Charitable donations\nD. Boycotts, Buycotts, Digital technology, Increased Sales\nAnswer: \nD\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQ: ______ are the obligations of workers towards their employer, based on individual contracts and wider employment laws.\nA. Employee rights\nB. Employee rights\nC. Employer duties\nD. Employee duties\nA: ",
        "prediction": "\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: The concept of corporate social responsibility (CSR) is often associated with the idea of corporations having a duty to act in the best interests of society. This idea is often referred to as the corporation's ________.\nA. Legal obligations\nB. Moral obligations\nC. Economic obligations\n",
        "gold": "D"
    },
    "3": {
        "origin_prompt": "There is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: Beyond the business case for engaging in CSR there are a number of moral arguments relating to: negative _______, the _______that corporations possess and the ________ of business and society.\nA. Externalities, Power, Independence\nB. Publicity, Insubstantial resources, Mutual dependence\nC. Publicity, Power, Independence\nD. Externalities, Power, Mutual dependence\nAnswer: \nD\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: _______ is the direct attempt to formally or informally manage ethical issues or problems, through specific policies, practices and programmes.\nA. Corporate social responsibility\nB. Business ethics management\nC. Sustainability\nD. Environmental management\nAnswer: \nB\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: To ensure the independence of the non-executive board members, they are a number of steps which can be taken, which include non-executives being drawn from _______ the company, being appointed for a _________ time period as well as being appointed _________.\nA. Outside, Limited, Independently\nB. Inside, Limited, Intermittently\nC. Outside, Unlimited, Intermittently\nD. Inside, Unlimited, Independently\nAnswer: \nA\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: Three contrasting tactics that CSO's can engage in to meet their aims are ________ which typically involves research and communication, ________, which may involve physically attacking a company's operations or ________, often involving some form of _______.\nA. Non-violent direct action, Violent direct action, Indirect action, Boycott\nB. Indirect action, Instrumental action, Non-violent direct action, Information campaign\nC. 
Indirect action, Violent direct action, Non-violent direct-action Boycott\nD. Non-violent direct action, Instrumental action, Indirect action, Information campaign\nAnswer: \nC\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: In contrast to _______, _______ aim to reward favourable behaviour by companies. The success of such campaigns have been heightened through the use of ___________, which allow campaigns to facilitate the company in achieving _________ .\nA. Buycotts, Boycotts, Blockchain technology, Charitable donations\nB. Buycotts, Boycotts, Digital technology, Increased Sales\nC. Boycotts, Buyalls, Blockchain technology, Charitable donations\nD. Boycotts, Buycotts, Digital technology, Increased Sales\nAnswer: \nD\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQ:  ______ is an employee's preferred ratio between work-related and non-work-related activities which, due to intensification of work and technological shifts, has become a hotly contested issue in recent years.\nA. Presenteeism\nB. Absenteeism\nC. Work-play balance\nD. Work-life balance\nA: ",
        "prediction": "\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: The concept of corporate social responsibility (CSR) is often criticized for being a form of _______, which is a form of _______ that is used to justify the actions of corporations.\nA. Greenwashing, Environmentalism\nB. Social engineering, Socialism\nC. Social engineering, Environmental",
        "gold": "D"
    },
    "4": {
        "origin_prompt": "There is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: Beyond the business case for engaging in CSR there are a number of moral arguments relating to: negative _______, the _______that corporations possess and the ________ of business and society.\nA. Externalities, Power, Independence\nB. Publicity, Insubstantial resources, Mutual dependence\nC. Publicity, Power, Independence\nD. Externalities, Power, Mutual dependence\nAnswer: \nD\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: _______ is the direct attempt to formally or informally manage ethical issues or problems, through specific policies, practices and programmes.\nA. Corporate social responsibility\nB. Business ethics management\nC. Sustainability\nD. Environmental management\nAnswer: \nB\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: To ensure the independence of the non-executive board members, they are a number of steps which can be taken, which include non-executives being drawn from _______ the company, being appointed for a _________ time period as well as being appointed _________.\nA. Outside, Limited, Independently\nB. Inside, Limited, Intermittently\nC. Outside, Unlimited, Intermittently\nD. Inside, Unlimited, Independently\nAnswer: \nA\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: Three contrasting tactics that CSO's can engage in to meet their aims are ________ which typically involves research and communication, ________, which may involve physically attacking a company's operations or ________, often involving some form of _______.\nA. Non-violent direct action, Violent direct action, Indirect action, Boycott\nB. Indirect action, Instrumental action, Non-violent direct action, Information campaign\nC. 
Indirect action, Violent direct action, Non-violent direct-action Boycott\nD. Non-violent direct action, Instrumental action, Indirect action, Information campaign\nAnswer: \nC\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: In contrast to _______, _______ aim to reward favourable behaviour by companies. The success of such campaigns have been heightened through the use of ___________, which allow campaigns to facilitate the company in achieving _________ .\nA. Buycotts, Boycotts, Blockchain technology, Charitable donations\nB. Buycotts, Boycotts, Digital technology, Increased Sales\nC. Boycotts, Buyalls, Blockchain technology, Charitable donations\nD. Boycotts, Buycotts, Digital technology, Increased Sales\nAnswer: \nD\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQ:  _______ can be a likened to their natural counterparts, comprising of a balanced network of interdependent organisms and their environments thus adding value to sustainability thinking due to the consideration of companies and industries as being bound together, and interdependent due to all kinds of resources and wastes.\nA. Industrial supply loops\nB. Industrial ecosystems\nC. Ecological ecosystems\nD. Corporate ecosystems\nA: ",
        "prediction": "B",
        "gold": "B"
    },
    "5": {
        "origin_prompt": "There is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: Beyond the business case for engaging in CSR there are a number of moral arguments relating to: negative _______, the _______that corporations possess and the ________ of business and society.\nA. Externalities, Power, Independence\nB. Publicity, Insubstantial resources, Mutual dependence\nC. Publicity, Power, Independence\nD. Externalities, Power, Mutual dependence\nAnswer: \nD\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: _______ is the direct attempt to formally or informally manage ethical issues or problems, through specific policies, practices and programmes.\nA. Corporate social responsibility\nB. Business ethics management\nC. Sustainability\nD. Environmental management\nAnswer: \nB\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: To ensure the independence of the non-executive board members, they are a number of steps which can be taken, which include non-executives being drawn from _______ the company, being appointed for a _________ time period as well as being appointed _________.\nA. Outside, Limited, Independently\nB. Inside, Limited, Intermittently\nC. Outside, Unlimited, Intermittently\nD. Inside, Unlimited, Independently\nAnswer: \nA\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: Three contrasting tactics that CSO's can engage in to meet their aims are ________ which typically involves research and communication, ________, which may involve physically attacking a company's operations or ________, often involving some form of _______.\nA. Non-violent direct action, Violent direct action, Indirect action, Boycott\nB. Indirect action, Instrumental action, Non-violent direct action, Information campaign\nC. 
Indirect action, Violent direct action, Non-violent direct-action Boycott\nD. Non-violent direct action, Instrumental action, Indirect action, Information campaign\nAnswer: \nC\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: In contrast to _______, _______ aim to reward favourable behaviour by companies. The success of such campaigns have been heightened through the use of ___________, which allow campaigns to facilitate the company in achieving _________ .\nA. Buycotts, Boycotts, Blockchain technology, Charitable donations\nB. Buycotts, Boycotts, Digital technology, Increased Sales\nC. Boycotts, Buyalls, Blockchain technology, Charitable donations\nD. Boycotts, Buycotts, Digital technology, Increased Sales\nAnswer: \nD\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQ: ________, where for example one party possess more resources, unfair distribution of ________, where one party gains more from the relationship, and CSOs being ________, are all limitations and risks of business-CSO collaborations.\nA. Power imbalance, Benefits, Hoodwinked\nB. Power imbalance, Resources, Co-opted\nC. Informational asymmetries, Benefits, Hoodwinked\nD. Informational asymmetries, Resources, Co-opted\nA: ",
        "prediction": "\n\nThere is a single choice question about business ethics. Answer the question by replying A, B, C or D.\nQuestion: The concept of corporate social responsibility (CSR) is a _______ concept, meaning that it is not legally required, but is expected by society.\nA. Legal\nB. Ethical\nC. Economic\nD. Political\nAnswer: \nB",
        "gold": "B"
    },
vateye commented 1 year ago

> Could you share the prediction of vicuna on MMLU dataset?

This is just a snippet of the predictions. It seems that the model cannot understand the prompt.
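
A quick way to check this is to run an answer extractor over the saved predictions and compare against `gold`. The sketch below is only an illustration: it assumes the predictions file has the shape shown in the snippet above (index mapped to `prediction` and `gold`), and it assumes the scorer keeps only the first capital letter of the model output. The helper names `first_capital` and `accuracy` are hypothetical and are not OpenCompass APIs.

```python
def first_capital(text: str) -> str:
    """Return the first uppercase ASCII letter in text, or '' if there is none.

    This extraction rule is an assumption made for illustration.
    """
    for ch in text:
        if "A" <= ch <= "Z":
            return ch
    return ""


def accuracy(records: dict) -> float:
    """Percentage of records whose extracted letter equals the gold label."""
    if not records:
        return 0.0
    hits = sum(first_capital(r["prediction"]) == r["gold"] for r in records.values())
    return 100.0 * hits / len(records)


# Two shortened records taken from the snippet above.
sample = {
    "0": {
        "prediction": "\n\nThere is a single choice question about business ethics.",
        "gold": "C",
    },
    "4": {"prediction": "B", "gold": "B"},
}

if __name__ == "__main__":
    for key, rec in sample.items():
        # Show what the extractor sees next to the expected label.
        print(key, repr(first_capital(rec["prediction"])), rec["gold"])
    print(accuracy(sample))
```

Under that assumption, a prediction that starts a new question yields "T" (from "There") rather than an option letter, so it is scored as wrong regardless of what the model would have answered.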

AboveParadise commented 1 year ago

I've got the same problem. My Vicuna predictions on MMLU are below:

    "0": {
        "origin_prompt": "There is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Find all c in Z_3 such that Z_3[x]/(x^2 + c) is a field.\nA. 0\nB. 1\nC. 2\nD. 3\nAnswer: \nB\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Statement 1 | If aH is an element of a factor group, then |aH| divides |a|. Statement 2 | If H and K are subgroups of G then HK is a subgroup of G.\nA. True, True\nB. False, False\nC. True, False\nD. False, True\nAnswer: \nB\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Statement 1 | Every element of a group generates a cyclic subgroup of the group. Statement 2 | The symmetric group S_10 has 10 elements.\nA. True, True\nB. False, False\nC. True, False\nD. False, True\nAnswer: \nC\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Statement 1| Every function from a finite set onto itself must be one to one. Statement 2 | Every subgroup of an abelian group is abelian.\nA. True, True\nB. False, False\nC. True, False\nD. False, True\nAnswer: \nA\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Find the characteristic of the ring 2Z.\nA. 0\nB. 3\nC. 12\nD. 30\nAnswer: \nA\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQ: Find the degree for the given field extension Q(sqrt(2), sqrt(3), sqrt(18)) over Q.\nA. 0\nB. 4\nC. 2\nD. 6\nA: ",
        "prediction": "4",
        "gold": "B"
    },
    "1": {
        "origin_prompt": "There is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Find all c in Z_3 such that Z_3[x]/(x^2 + c) is a field.\nA. 0\nB. 1\nC. 2\nD. 3\nAnswer: \nB\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Statement 1 | If aH is an element of a factor group, then |aH| divides |a|. Statement 2 | If H and K are subgroups of G then HK is a subgroup of G.\nA. True, True\nB. False, False\nC. True, False\nD. False, True\nAnswer: \nB\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Statement 1 | Every element of a group generates a cyclic subgroup of the group. Statement 2 | The symmetric group S_10 has 10 elements.\nA. True, True\nB. False, False\nC. True, False\nD. False, True\nAnswer: \nC\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Statement 1| Every function from a finite set onto itself must be one to one. Statement 2 | Every subgroup of an abelian group is abelian.\nA. True, True\nB. False, False\nC. True, False\nD. False, True\nAnswer: \nA\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Find the characteristic of the ring 2Z.\nA. 0\nB. 3\nC. 12\nD. 30\nAnswer: \nA\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQ: Let p = (1, 2, 5, 4)(2, 3) in S_5 . Find the index of <p> in S_5.\nA. 8\nB. 2\nC. 24\nD. 120\nA: ",
        "prediction": "2",
        "gold": "C"
    },
    "2": {
        "origin_prompt": "There is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Find all c in Z_3 such that Z_3[x]/(x^2 + c) is a field.\nA. 0\nB. 1\nC. 2\nD. 3\nAnswer: \nB\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Statement 1 | If aH is an element of a factor group, then |aH| divides |a|. Statement 2 | If H and K are subgroups of G then HK is a subgroup of G.\nA. True, True\nB. False, False\nC. True, False\nD. False, True\nAnswer: \nB\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Statement 1 | Every element of a group generates a cyclic subgroup of the group. Statement 2 | The symmetric group S_10 has 10 elements.\nA. True, True\nB. False, False\nC. True, False\nD. False, True\nAnswer: \nC\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Statement 1| Every function from a finite set onto itself must be one to one. Statement 2 | Every subgroup of an abelian group is abelian.\nA. True, True\nB. False, False\nC. True, False\nD. False, True\nAnswer: \nA\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Find the characteristic of the ring 2Z.\nA. 0\nB. 3\nC. 12\nD. 30\nAnswer: \nA\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQ: Find all zeros in the indicated finite field of the given polynomial with coefficients in that field. x^5 + 3x^3 + x^2 + 2x in Z_5\nA. 0\nB. 1\nC. 0,1\nD. 0,4\nA: ",
        "prediction": "0,1\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Find the order of the element 3 in the group Z\\_7.\nA. 1\nB. 2\nC. 3\nD. 4\nAnswer: \nC\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\n",
        "gold": "D"
    },
    "3": {
        "origin_prompt": "There is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Find all c in Z_3 such that Z_3[x]/(x^2 + c) is a field.\nA. 0\nB. 1\nC. 2\nD. 3\nAnswer: \nB\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Statement 1 | If aH is an element of a factor group, then |aH| divides |a|. Statement 2 | If H and K are subgroups of G then HK is a subgroup of G.\nA. True, True\nB. False, False\nC. True, False\nD. False, True\nAnswer: \nB\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Statement 1 | Every element of a group generates a cyclic subgroup of the group. Statement 2 | The symmetric group S_10 has 10 elements.\nA. True, True\nB. False, False\nC. True, False\nD. False, True\nAnswer: \nC\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Statement 1| Every function from a finite set onto itself must be one to one. Statement 2 | Every subgroup of an abelian group is abelian.\nA. True, True\nB. False, False\nC. True, False\nD. False, True\nAnswer: \nA\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Find the characteristic of the ring 2Z.\nA. 0\nB. 3\nC. 12\nD. 30\nAnswer: \nA\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQ: Statement 1 | A factor group of a non-Abelian group is non-Abelian. Statement 2 | If K is a normal subgroup of H and H is a normal subgroup of G, then K is a normal subgroup of G.\nA. True, True\nB. False, False\nC. True, False\nD. False, True\nA: ",
        "prediction": "\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion: Find the order of the element 3 in the group Z\\_6.\nA. 2\nB. 3\nC. 6\nD. 12\nAnswer: \nB\n\nThere is a single choice question about abstract algebra. Answer the question by replying A, B, C or D.\nQuestion:",
        "gold": "B"
    },
Leymore commented 1 year ago

You may try to run vicuna with this config: https://github.com/open-compass/opencompass/blob/main/configs/datasets/mmlu/mmlu_ppl_ac766d.py
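
The linked config evaluates MMLU in `ppl` mode rather than `gen` mode. As a rough illustration of the idea (a sketch under stated assumptions, not OpenCompass's actual implementation): instead of parsing free-form output, each candidate answer is appended to the prompt, the model's loss on the completed text is computed, and the option with the lowest loss is chosen. A model that keeps writing new questions therefore cannot produce an unparseable answer. The names `pick_by_ppl` and `toy_loss` are hypothetical, and `toy_loss` merely stands in for a real language-model loss.

```python
from typing import Callable, Sequence


def pick_by_ppl(
    prompt: str,
    options: Sequence[str],
    loss_fn: Callable[[str], float],
) -> str:
    """Return the option whose completed prompt receives the lowest loss."""
    scores = {opt: loss_fn(prompt + opt) for opt in options}
    return min(scores, key=scores.get)


def toy_loss(text: str) -> float:
    """Stand-in for a language-model loss (assumption for illustration).

    A real run would query the model; here we pretend it prefers 'B'.
    """
    return 0.5 if text.endswith("B") else 2.0


if __name__ == "__main__":
    prompt = "Q: <question and options>\nA: "
    print(pick_by_ppl(prompt, ["A", "B", "C", "D"], toy_loss))
```

Because the answer is always one of the listed options, the score in this mode reflects which option the model finds most likely, not whether its generated text happens to start with a letter.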

tonysy commented 1 year ago

Feel free to re-open if needed.