liyongsea / parallel_corpus_mnbvc

parallel corpus dataset from the mnbvc project
Apache License 2.0

Error investigation #30

Closed liyongsea closed 1 year ago

liyongsea commented 1 year ago

Investigate the performance of GptOfflineDetector

voidf commented 1 year ago

Here is an input sample for which the current prompt cannot produce a paragraph segmentation (GPT just keeps apologizing):

{"time": "2023-06-01 19:18:10.363074", "rec": "541713", "batch": 29, "step": 1400, "input": "(b) Contribution to the implementation of the Framework Convention on Climate Change through\nencouraging policy dialogue, participating in inter-agency expert groups and meetings, namely, the\nVulnerability and Adaptation Resource Group, Consultative Group of Experts on national\ncommunications from non-Annex I Parties to the Framework Convention on Climate Change, and\nresponding to requests by Parties related to the conferences of the parties and related bodies (one\nexercise), (GC.22/3, GC.22/9), (external: other United Nations organizations, national Governments and\nvarious donors), (internal: Division of Environmental Conventions, Division of GEF Coordination),\n(multilateral environmental agreement link: to all multilateral environmental agreements) **\n(c) Organize inter-agency meetings (UNDP-UNEP-World Bank-WHO-FAO) on the poverty-environment\nnexus to plan further cooperative activities (two meetings), (GC.22/9, GC.22/10), (external: UNDP,\nWorld Bank, WHO, FAO, others) **\nOutputs planned for the biennium in pursuit of expected accomplishment (c)\nTechnical cooperation\nAdvisory services:\n(a) Organize meetings in conjunction with regional civil society forums to increase capacity in the field of\nenvironment of civil society organizations in the South and civil society organizations from countries\nwith economies in transition (10 meetings), (GCSS.VII/5, GC.22/14, GC.22/18), (internal: all divisions),\n(multilateral environmental agreement link: to all multilateral environmental agreements) ***\n(b) Provide substantive input to subregional, regional and global environmental intergovernmental meetings\nand processes, including UNEP-organized meetings on mainstreaming of gender issues into international\nenvironmental policies and processes (inputs to four meetings), (GC.18/6, GC.20/9), (internal: Division\nof Early Warning and Assessment, Division of Regional 
Cooperation, Division of Communications and\nPublic Information), (multilateral environmental agreement link: to all multilateral environmental\nagreements) ***\nGroup training such as workshops, seminars and symposiums, etc:\n(c) Organize the annual global civil society forum in conjunction with the Governing Council/Global\nMinisterial Environment Forum to engage stakeholders in a substantive dialogue on UNEP activities (two\nmeetings), (Malm\u00f6 Declaration, GC.21/19, GCSS.VII/5), (internal: Division of Regional Cooperation,\nDivision of Environmental Conventions, Division of Communications and Public Information),\n(multilateral environmental agreement link: to all multilateral environmental agreements) ***\n(d) Organize relevant partnership campaigns, including with the private sector and support policy dialogues\nto provide technical support to civil society in articulating linkages between environment and poverty\nalleviation (10 events), (GCSS.VI/1, GCSS.VII/5, GC.21/18), (internal: Division of Early Warning and\nAssessment, Division of Environmental Policy Implementation, Division of Regional Cooperation,\nDivision of Communications and Public Information), (multilateral environmental agreement link: to all\nmultilateral environmental agreements) **\n(e) Organize an expert group meeting and develop partnerships and collaborative arrangements to increase\nthe understanding among United Nations agencies, multilateral environmental agreements and civil\nsociety of the linkages between the environment and human society, pending the decision of the\nGoverning Council at its twenty-third session on the follow-up to decision GC.22/16 related to\nenvironment and cultural diversity (one event), ( GC.21/18), (internal: Division of Early Warning and\nAssessment, Division of Environmental Policy Implementation, Division of Regional Cooperation),\n(external: United Nations Educational, Scientific and Cultural Organization (UNESCO), multilateral\nenvironmental 
agreements, Permanent Forum on Indigenous Issues, World Intellectual Property\nOrganization (WIPO), FAO)\n9\nOutputs listed under this category reflect the catalytic work by UNEP. They are presented here for\nconsideration by the Governing Council/Global Ministerial Environment Forum. In the preparation of the\nUNEP budget fascicle, however, these outputs will be reflected in an aggregated form as part of the narrative\nof the introduction.\nOutputs planned for the biennium in pursuit of expected accomplishment (d)\nServicing of intergovernmental and expert bodies\nParliamentary documentation:\n(a) Reports on the outcomes of meetings of the Environmental Management Group and its issue management\ngroups to the Governing Council/Global Ministerial Environment Forum and the United Nations General\nAssembly, (two reports), (GA 53/242, GA 58/209, GCSS.VIII/1 VI), (internal: secretariat of the\nGoverning Council), (external: members of the Environmental Management Group)\n(b) Preparation for and substantive reporting to the UNEP Governing Council, particularly in the area of\npolicy development and coordination, (one document), (GA 2997 (XXVII)), (internal: all divisions as\nnecessary), (external: Governments)\nOther substantive activities\nBooklets, pamphlets, fact-sheets, information kits:\n(a) Development of products to promote cross-sectoral linkages between the environment and other areas\nsuch as health, children and the inter-faith community (one set), (Johannesburg Plan of Implementation 7\n(f), 53, 56, 57, GC.22/4, GC.22/20), (internal: all divisions as necessary), (external: WHO, UNICEF,\nother United Nations partners, Governments, non-governmental organizations, academic institutions,\nprivate sector and religious groups active in the issue areas) ***\nSpecial events:\n(b) Organization of special events, at United Nations Headquarters and externally with major groups,\nproducing greater visibility, public awareness and support for the mission of UNEP and 
environmental\nissues in general, (one event), (GA 2997 (XXVII), GA 58/158, GCSS.VI/1, GCSS.VII/5), (internal: all\ndivisions as necessary), (external: Governments, major groups) ***\nExhibits, guided tours and lectures:\n(c) Represent the UNEP Executive Director in briefings and provide policy advice and information materials\nto permanent missions to United Nations Headquarters and their delegations to the General Assembly and\nthe United Nations Economic and Social Council and its subsidiary bodies, (one exercise as required),\n(GA 2997 (XXVII)), (internal: all divisions as necessary), (external: Governments) ***\n(d) Regular briefings and information materials to major groups on the mission of UNEP and environmental", "exc": "unknown", "msg": "('max() arg is an empty sequence',)"}

Just read the `input` field of this JSON record.
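A minimal sketch of pulling the `input` field out of one error-log record, assuming each record is a single JSON object on one line with the keys shown above. The helper name `read_input_field` and the shortened sample record are hypothetical, for illustration only.

```python
import json


def read_input_field(log_line: str) -> str:
    """Return the raw text that was sent to the model for one log record."""
    record = json.loads(log_line)
    return record["input"]


# Shortened, hypothetical record using the same keys as the log entry above.
sample_line = json.dumps({
    "rec": "541713",
    "batch": 29,
    "step": 1400,
    "input": "first line\nsecond line",
    "exc": "unknown",
    "msg": "('max() arg is an empty sequence',)",
})

# The input is newline-separated, so splitting recovers the original lines.
original_lines = read_input_field(sample_line).split("\n")
```

Splitting on `"\n"` gives back the individual source lines, which is the form the prompt asks the model to regroup into paragraphs.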

voidf commented 1 year ago

Thomas's new prompt. It is constructed as follows:

def generate_prompt2(input_content: str):
    return [
        {'role': 'system', 'content': 'You are an experienced English editor. '},
        {'role': 'user', 'content': '''I want you to check the following lines and group them into paragraphs. Please follow the instructions as below:

- Remove all the unnecessary linebreaks and merge the hyphenated words. 
- The original lines may be incomplete or mispelled, but DO NOT change any words of the original lines or add anything new. You must remember this.
- Return the paragraphs along with the confidence score (how you feel confident about your revision, 0~100).
- Return the result in JSON format, like {"paragraphs":["xxx","yyy"],"score":80}.
- Return the JSON only with nothing else, DO NOT explain anything.

### Original Lines:
''' + input_content},
    ]
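The prompt tells the model not to change or add any words. A rough way to check a reply against that rule is to compare the whitespace-separated words of the input with those of the returned paragraphs. The sketch below is an assumption, not part of the repository: the name `words_preserved` is hypothetical, and the check is deliberately strict, so it would also flag merged hyphenated words or a dropped page marker.

```python
def words_preserved(original: str, paragraphs: list[str]) -> bool:
    """True when the paragraphs contain exactly the original words, in order.

    Line breaks and repeated spaces are ignored; any added, removed, or
    altered word makes the check fail.
    """
    return original.split() == " ".join(paragraphs).split()


original_text = "We must act\nby investing\nin education.\nThe Acting President: On behalf"

# Only line breaks were removed, so the words still match.
faithful = ["We must act by investing in education.", "The Acting President: On behalf"]

# The last paragraph was extended with words that are not in the input.
extended = ["We must act by investing in education.", "The Acting President: On behalf of the Assembly"]

faithful_ok = words_preserved(original_text, faithful)
extended_ok = words_preserved(original_text, extended)
```

A looser comparison (for example, tolerating merged hyphenated words) may be more practical; the strict version is only meant to show the idea.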

Input:

{"time": "2023-06-07 13:51:01.711223", "rec": "3999256", "batch": 4, "step": 1400, "input": "Mali, which in 2011 succumbed to the violence that\noriginated in Libya, has never really recovered. It has in\nturn become an incubation centre for a form of terrorism\nthat is characteristic of a Sahel that is deeply affected\nby the effects of climate change preventing the practice\nof pastoral farming. That ecosystem of violence has\nhad a deadweight effect on the young shepherds of the\ncommunities most affected by climate change, among\nwhom a number of terrorist vocations have arisen. This\nscourge has spread from northern Mali to the Niger and\nBurkina Faso and is now attempting to spread to the\ncountries of the Gulf of Guinea. The violence has such\npotential to destabilize State institutions that it resulted\nin the downfall of the democratically elected regimes of\nMali and Burkina Faso in 2020 and 2022, respectively.\nThe Niger, my country, in addition to the terrorist\nhotbed known as the three-borders region \u2014 Mali,\nthe Niger and Burkina Faso \u2014 is also facing another\nterrorist hotbed in the Lake Chad basin, where the\nvarious groups that claim to be part of the shadowy\nBoko Haram operate.\nDespite the strong pressures related to the\ncircumstances I have just described, my country\nis showing great resilience centred on very wise\ngovernance, the promotion of a culture of tolerance\nand community cohesion, as well as the rules of\ndemocracy and the rule of law. It was that resolute\nchoice of rights and freedoms that made it possible\nto organize transparent elections that sanctioned the\nfirst change of Head of the State by virtue of which\na democratically elected President passed the baton\nto another democratically elected President in April\n2021. 
Need it be said that our experience proves that the\nsurest way to ward off the effects of terrorist violence is\nto strengthen the democratic regime and nothing else?\nThe war against terrorism now requires us to devote\nsignificant resources to increasing the numbers of our\nforces, acquiring adequate equipment and building\nthe capacities of our soldiers. Allow me to take this\nopportunity to thank the various partners engaged with\nus in the fight against terrorism. I would particularly\nlike to thank France, whose action through Operation\nBarkhane is a major asset in the struggle against our\nenemies. I also thank the United States of America,\nthe Federal Republic of Germany and all the other\ncountries committed alongside us in various ways, all\nequally valuable.\nThat being said, I find it crucial to point out that\nthe international community\u2019s commitment to the\nfight against terrorism in the Sahel has shortcomings\nthat need to be identified and addressed. Indeed, this\nterrorism derives a large part of its financial resources\nfrom trans-Saharan drug trafficking towards Europe\nand Asia, via Libya. Most of the weapons flooding the\nSahel terrorist violence market also come from Libya.\nI believe that we are faced with a problem that is not\nparticularly complicated. Why then has it not yet been\npossible to set up an adequate system with the necessary\nmeans to combat this phenomenon properly? It is indeed\ntime for collective reflection. It is time for the great\nPowers present in that region and the international\ncommunity as a whole to join the countries of our\nregion in a far more relevant reflection to define an\neffective course of action in the fight against drug and\narms trafficking in the Sahel.\nDespite the major challenges we face, as described\nearlier, the Niger intends to remain a solid and stable\nState, backed by democratic institutions, and resolutely\ncommitted to the fight against poverty and for\ndevelopment. 
Our deep conviction in that regard is that\nthe most effective means of combating poverty consists\nin promoting education. The development challenges\nfor the Niger are summarized by the following\nsimple statistics: an annual population growth rate\nof 3.9 per cent; an average synthetic fertility index\nof seven children per woman; a first birth for nearly\n50 per cent of girls before the age of 15; and 50 per cent\nof the population aged 15 on average. Those statistics\nsay much about the state of the education system in\nour country, and that is why we are committed to act\nwith determination to improve things in this area. As\nsuch, the agenda based on which the people of the Niger\nelected us aims to take action on two fronts: improving\naccess to and the quality of education.\nWith regard to quality, we have decided to place\nparticular emphasis on the quality of teacher training\nand the professionalization of teaching. With regard to\nimproving access, our policy emphasizes the building of\nschool infrastructure. The other particularly important\naspect of our educational programme that I wish to\nemphasize is aimed at achieving gender equity and\nbuilding residential facilities for girls in rural schools.\n7/50\nIndeed, because of the precarious living conditions of\nchildrenin rural schools, far from their parents, the\nlatter tend to prevent their daughters from continuing\ntheir studies. 
These adolescents are given in marriage\nas soon as they are out of the education system, which\nexplains some of the previously mentioned statistics.\nIn offering girls the tranquillity and security desired\nby their parents, residential facilities also offer girls\nconditions in which to continue their studies with\nthe possibility of staying in school and acquiring\nthe necessary academic and professional skills.\nThe large-scale expansion of residential facilities\nfor girls in rural schools contributes to improved\neducation-system performance and a reduction in the\nrate of population growth.\nMy reason for highlighting our ambitions in\nthe field of education for a country like the Niger,\nafter having spoken about climate change, terrorism\nand insecurity in the Sahel, is that we are aware that\nthese three issues \u2014 demography, climate change and\ninsecurity \u2014 are closely related. Indeed, the terrorism\ncurrently at work in the Sahel is connected to the living\nconditions of certain communities whose environment\nhas been considerably disrupted by climate change. The\nSahel is also an area that has experienced particularly\nhigh population growth in recent decades. The\ncombination of these two phenomena \u2014 demography\nand climate degradation \u2014 has created, as a result\nof the regional disorder following the fall of Colonel\nAl-Qaddafi\u2019s regime in 2011, the situation of chaos that\nthe countries of the Sahel are experiencing today.\nThat is why it is not enough for the United Nations\nand the international community to be moved by or\nto discuss terrorist violence at length. We must act\nby investing the necessary resources in education to\ncombat the violence of today and prevent the violence\nof tomorrow.\nThe Acting President: On behalf of the General", "exc": "unknown", "msg": "('Expecting value: line 1 column 1 (char 0)',)"}

All of the output samples contain leading non-JSON content; each is given below.

Output (sample 1):

{"object":"chat.completion","created":1686117062,"model":"gpt-3.5-turbo-0301","usage":{"prompt_tokens":1550,"completion_tokens":1308,"total_tokens":2858},"choices":[{"message":{"role":"assistant","content":"Assembly, I would like to thank the President of the\nNiger for the statement he has just made.\n\n{\"paragraphs\":[\n\"Mali, which in 2011 succumbed to the violence that originated in Libya, has never really recovered. It has in turn become an incubation centre for a form of terrorism that is characteristic of a Sahel that is deeply affected by the effects of climate change preventing the practice of pastoral farming. That ecosystem of violence has had a deadweight effect on the young shepherds of the communities most affected by climate change, among whom a number of terrorist vocations have arisen. This scourge has spread from northern Mali to the Niger and Burkina Faso and is now attempting to spread to the countries of the Gulf of Guinea. The violence has such potential to destabilize State institutions that it resulted in the downfall of the democratically elected regimes of Mali and Burkina Faso in 2020 and 2022, respectively.\",\n\"The Niger, my country, in addition to the terrorist hotbed known as the three-borders region — Mali, the Niger and Burkina Faso — is also facing another terrorist hotbed in the Lake Chad basin, where the various groups that claim to be part of the shadowy Boko Haram operate.\",\n\"Despite the strong pressures related to the circumstances I have just described, my country is showing great resilience centred on very wise governance, the promotion of a culture of tolerance and community cohesion, as well as the rules of democracy and the rule of law. 
It was that resolute choice of rights and freedoms that made it possible to organize transparent elections that sanctioned the first change of Head of the State by virtue of which a democratically elected President passed the baton to another democratically elected President in April 2021. Need it be said that our experience proves that the surest way to ward off the effects of terrorist violence is to strengthen the democratic regime and nothing else?\",\n\"The war against terrorism now requires us to devote significant resources to increasing the numbers of our forces, acquiring adequate equipment and building the capacities of our soldiers. Allow me to take this opportunity to thank the various partners engaged with us in the fight against terrorism. I would particularly like to thank France, whose action through Operation Barkhane is a major asset in the struggle against our enemies. I also thank the United States of America, the Federal Republic of Germany and all the other countries committed alongside us in various ways, all equally valuable.\",\n\"That being said, I find it crucial to point out that the international community’s commitment to the fight against terrorism in the Sahel has shortcomings that need to be identified and addressed. Indeed, this terrorism derives a large part of its financial resources from trans-Saharan drug trafficking towards Europe and Asia, via Libya. Most of the weapons flooding the Sahel terrorist violence market also come from Libya. I believe that we are faced with a problem that is not particularly complicated. Why then has it not yet been possible to set up an adequate system with the necessary means to combat this phenomenon properly? It is indeed time for collective reflection. 
It is time for the great Powers present in that region and the international community as a whole to join the countries of our region in a far more relevant reflection to define an effective course of action in the fight against drug and arms trafficking in the Sahel.\",\n\"Despite the major challenges we face, as described earlier, the Niger intends to remain a solid and stable State, backed by democratic institutions, and resolutely committed to the fight against poverty and for development. Our deep conviction in that regard is that the most effective means of combating poverty consists in promoting education. The development challenges for the Niger are summarized by the following simple statistics: an annual population growth rate of 3.9 per cent; an average synthetic fertility index of seven children per woman; a first birth for nearly 50 per cent of girls before the age of 15; and 50 per cent of the population aged 15 on average. Those statistics say much about the state of the education system in our country, and that is why we are committed to act with determination to improve things in this area. As such, the agenda based on which the people of the Niger elected us aims to take action on two fronts: improving access to and the quality of education.\",\n\"With regard to quality, we have decided to place particular emphasis on the quality of teacher training and the professionalization of teaching. With regard to improving access, our policy emphasizes the building of school infrastructure. The other particularly important aspect of our educational programme that I wish to emphasize is aimed at achieving gender equity and building residential facilities for girls in rural schools. Indeed, because of the precarious living conditions of childrenin rural schools, far from their parents, the latter tend to prevent their daughters from continuing their studies. 
These adolescents are given in marriage as soon as they are out of the education system, which explains some of the previously mentioned statistics. In offering girls the tranquillity and security desired by their parents, residential facilities also offer girls conditions in which to continue their studies with the possibility of staying in school and acquiring the necessary academic and professional skills. The large-scale expansion of residential facilities for girls in rural schools contributes to improved education-system performance and a reduction in the rate of population growth.\",\n\"My reason for highlighting our ambitions in the field of education for a country like the Niger, after having spoken about climate change, terrorism and insecurity in the Sahel, is that we are aware that these three issues — demography, climate change and insecurity — are closely related. Indeed, the terrorism currently at work in the Sahel is connected to the living conditions of certain communities whose environment has been considerably disrupted by climate change. The Sahel is also an area that has experienced particularly high population growth in recent decades. The combination of these two phenomena — demography and climate degradation — has created, as a result of the regional disorder following the fall of Colonel Al-Qaddafi’s regime in 2011, the situation of chaos that the countries of the Sahel are experiencing today. That is why it is not enough for the United Nations and the international community to be moved by or to discuss terrorist violence at length. We must act by investing the necessary resources in education to combat the violence of today and prevent the violence of tomorrow.\",\n\"The Acting President: On behalf of the General Assembly, I would like to thank the President of the Niger for the statement he has just made.\"],\n\"score\": 98}"},"finish_reason":"stop","index":0}]}

Output (sample 2):

{"object":"chat.completion","created":1686116956,"model":"gpt-3.5-turbo-0301","usage":{"prompt_tokens":1550,"completion_tokens":1298,"total_tokens":2848},"choices":[{"message":{"role":"assistant","content":"Assembly, I thank the President of the Republic of Niger for his statement just made.\n\n{\"paragraphs\":[\n\"Mali, which in 2011 succumbed to the violence that originated in Libya, has never really recovered. It has in turn become an incubation centre for a form of terrorism that is characteristic of a Sahel that is deeply affected by the effects of climate change preventing the practice of pastoral farming. That ecosystem of violence has had a deadweight effect on the young shepherds of the communities most affected by climate change, among whom a number of terrorist vocations have arisen. This scourge has spread from northern Mali to the Niger and Burkina Faso and is now attempting to spread to the countries of the Gulf of Guinea. The violence has such potential to destabilize State institutions that it resulted in the downfall of the democratically elected regimes of Mali and Burkina Faso in 2020 and 2022, respectively.\",\n\"The Niger, my country, in addition to the terrorist hotbed known as the three-borders region — Mali, the Niger and Burkina Faso — is also facing another terrorist hotbed in the Lake Chad basin, where the various groups that claim to be part of the shadowy Boko Haram operate.\",\n\"Despite the strong pressures related to the circumstances I have just described, my country is showing great resilience centred on very wise governance, the promotion of a culture of tolerance and community cohesion, as well as the rules of democracy and the rule of law. It was that resolute choice of rights and freedoms that made it possible to organize transparent elections that sanctioned the first change of Head of the State by virtue of which a democratically elected President passed the baton to another democratically elected President in April 2021. 
Need it be said that our experience proves that the surest way to ward off the effects of terrorist violence is to strengthen the democratic regime and nothing else?\",\n\"The war against terrorism now requires us to devote significant resources to increasing the numbers of our forces, acquiring adequate equipment and building the capacities of our soldiers. Allow me to take this opportunity to thank the various partners engaged with us in the fight against terrorism. I would particularly like to thank France, whose action through Operation Barkhane is a major asset in the struggle against our enemies. I also thank the United States of America, the Federal Republic of Germany and all the other countries committed alongside us in various ways, all equally valuable.\",\n\"That being said, I find it crucial to point out that the international community’s commitment to the fight against terrorism in the Sahel has shortcomings that need to be identified and addressed. Indeed, this terrorism derives a large part of its financial resources from trans-Saharan drug trafficking towards Europe and Asia, via Libya. Most of the weapons flooding the Sahel terrorist violence market also come from Libya. I believe that we are faced with a problem that is not particularly complicated. Why then has it not yet been possible to set up an adequate system with the necessary means to combat this phenomenon properly? It is indeed time for collective reflection. It is time for the great Powers present in that region and the international community as a whole to join the countries of our region in a far more relevant reflection to define an effective course of action in the fight against drug and arms trafficking in the Sahel.\",\n\"Despite the major challenges we face, as described earlier, the Niger intends to remain a solid and stable State, backed by democratic institutions, and resolutely committed to the fight against poverty and for development. 
Our deep conviction in that regard is that the most effective means of combating poverty consists in promoting education. The development challenges for the Niger are summarized by the following simple statistics: an annual population growth rate of 3.9 per cent; an average synthetic fertility index of seven children per woman; a first birth for nearly 50 per cent of girls before the age of 15; and 50 per cent of the population aged 15 on average. Those statistics say much about the state of the education system in our country, and that is why we are committed to act with determination to improve things in this area. As such, the agenda based on which the people of the Niger elected us aims to take action on two fronts: improving access to and the quality of education.\",\n\"With regard to quality, we have decided to place particular emphasis on the quality of teacher training and the professionalization of teaching. With regard to improving access, our policy emphasizes the building of school infrastructure. The other particularly important aspect of our educational programme that I wish to emphasize is aimed at achieving gender equity and building residential facilities for girls in rural schools. Indeed, because of the precarious living conditions of childrenin rural schools, far from their parents, the latter tend to prevent their daughters from continuing their studies. These adolescents are given in marriage as soon as they are out of the education system, which explains some of the previously mentioned statistics. In offering girls the tranquillity and security desired by their parents, residential facilities also offer girls conditions in which to continue their studies with the possibility of staying in school and acquiring the necessary academic and professional skills. 
The large-scale expansion of residential facilities for girls in rural schools contributes to improved education-system performance and a reduction in the rate of population growth.\",\n\"My reason for highlighting our ambitions in the field of education for a country like the Niger, after having spoken about climate change, terrorism and insecurity in the Sahel, is that we are aware that these three issues — demography, climate change and insecurity — are closely related. Indeed, the terrorism currently at work in the Sahel is connected to the living conditions of certain communities whose environment has been considerably disrupted by climate change. The Sahel is also an area that has experienced particularly high population growth in recent decades. The combination of these two phenomena — demography and climate degradation — has created, as a result of the regional disorder following the fall of Colonel Al-Qaddafi’s regime in 2011, the situation of chaos that the countries of the Sahel are experiencing today.\",\n\"That is why it is not enough for the United Nations and the international community to be moved by or to discuss terrorist violence at length. We must act by investing the necessary resources in education to combat the violence of today and prevent the violence of tomorrow.\",\n\"The Acting President: On behalf of the General Assembly, I thank the President of the Republic of Niger for his statement just made.\"],\"score\":90}"},"finish_reason":"stop","index":0}]}
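In both samples the model first continues the truncated last input line ("Assembly, I ...") and only then emits the JSON, so parsing the whole reply with `json.loads` fails at the first character, which matches the logged message `Expecting value: line 1 column 1 (char 0)`. One possible workaround, sketched here under the assumption that the reply contains exactly one JSON object after the stray text, is to scan for an opening brace and decode from there with the standard library's `json.JSONDecoder.raw_decode`. The function name `extract_json_object` is hypothetical, and this is not necessarily the fix adopted in the repository.

```python
import json


def extract_json_object(content: str) -> dict:
    """Return the first JSON object embedded in `content`, skipping leading text."""
    decoder = json.JSONDecoder()
    start = content.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever follows it.
            obj, _end = decoder.raw_decode(content, start)
            return obj
        except json.JSONDecodeError:
            # This brace did not open valid JSON; try the next one.
            start = content.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


# Shortened, hypothetical reply in the same shape as the samples above.
reply = 'Assembly, I thank the President.\n\n{"paragraphs":[\n"a b",\n"c"],"score":90}'
result = extract_json_object(reply)
paragraphs = result["paragraphs"]
score = result["score"]
```

This only recovers the JSON; it does not remove text the model added inside the paragraphs themselves, such as the completed final sentence seen in both samples.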

liyongsea commented 1 year ago

Write a report of errors

liyongsea commented 1 year ago

Solved by https://github.com/liyongsea/parallel_corpus_mnbvc/pull/32