Closed JulienGaumez closed 1 week ago
Looks good. Could you please always include an example JSON output for each subset in the PR, so it can be checked?
Like this?
{"dataset_name": "AGBDE", "subset_name": "agb_de_judgment", "source": "https://huggingface.co/datasets/d4br4/agb-de", "instruction_language": "en", "prompt_language": "de", "answer_language": "en", "jurisdiction": "GERMANY", "task_type": "TEXT_CLASSIFICATION", "downloaded_timestamp": "07-05-2024", "instruction": "Can the following clause be enforced by the parties of the contract under the german governing law? Seperate clearly whether it is valid or potentially void.", "prompt": "Clause: \u00a72 Vertragsabschluss2.5\r\n\r\nSollte <
{"dataset_name": "AGBDE", "subset_name": "agb_de_mc", "source": "https://huggingface.co/datasets/d4br4/agb-de", "instruction_language": "en", "prompt_language": "de", "answer_language": "en", "jurisdiction": "GERMANY", "task_type": "MULTIPLE_CHOICE", "downloaded_timestamp": "07-05-2024", "instruction": "Determine the suitable topic or topics for this clause within a German consumer contract. Return your selection as 'Topic(s):'. Please use only the following topics: age, applicability, applicableLaw, arbitration, changes, codeOfConduct, conclusionOfContract, delivery, description, disposal, intellectualProperty, liability, party, payment, personalData, placeOfJurisdiction, prices, retentionOfTitle, severability, textStorage, warranty, withdrawal", "prompt": "Clause: \u00a72 Vertragsabschluss2.5\r\n\r\nSollte <
Yes, this looks great, thanks.
This code addresses the issue https://github.com/JoelNiklaus/LawInstruct/issues/18. I included two subsets for this dataset. The first is the task of judging whether a given clause is potentially void; the second is the task of selecting the right topic(s) for a clause.
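For reviewers, here is a minimal sketch of how one clause could yield one record per subset. The field names and constant values are copied from the example output in this thread; the helper `make_record`, the `answer` key, the placeholder instructions, the sample clause, and the answer strings are assumptions for illustration and are not taken from the PR's code.

```python
import json

# Fields shared by both subsets, copied from the example output above.
COMMON = {
    "dataset_name": "AGBDE",
    "source": "https://huggingface.co/datasets/d4br4/agb-de",
    "instruction_language": "en",
    "prompt_language": "de",
    "answer_language": "en",
    "jurisdiction": "GERMANY",
    "downloaded_timestamp": "07-05-2024",
}


def make_record(subset_name, task_type, instruction, clause, answer):
    """Hypothetical helper: build one record for a given subset."""
    record = dict(COMMON)  # copy so the shared dict is never mutated
    record.update(
        subset_name=subset_name,
        task_type=task_type,
        instruction=instruction,
        prompt=f"Clause: {clause}",
        answer=answer,  # assumed key; the example output is cut off before it
    )
    return record


clause = "\u00a72 Vertragsabschluss"  # placeholder clause text

judgment = make_record(
    "agb_de_judgment", "TEXT_CLASSIFICATION",
    "Is the following clause valid or potentially void?",  # placeholder
    clause, "potentially void",                            # placeholder
)
mc = make_record(
    "agb_de_mc", "MULTIPLE_CHOICE",
    "Determine the suitable topic or topics for this clause.",  # placeholder
    clause, "Topic(s): conclusionOfContract",                   # placeholder
)

# One JSON object per line, as in the example output.
lines = [json.dumps(judgment), json.dumps(mc)]
```

Both records share the same prompt and differ only in subset name, task type, instruction, and answer, which matches the two example lines posted above.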
Note that I also edited one line in the instruction manager to fix encoding issues I encountered on Windows.
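For context, a common cause of this kind of problem is calling `open()` without an explicit encoding: Python then falls back to the platform's preferred encoding, which on Windows is often a legacy code page rather than UTF-8. The sketch below shows the general pattern only; the function name `read_text` and the file name are hypothetical, and this is not necessarily the exact line changed in the instruction manager.

```python
import os
import tempfile


def read_text(path):
    """Read a text file as UTF-8 regardless of the platform default."""
    # Passing encoding explicitly avoids depending on the locale's code page.
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


clause = "\u00a72 Vertragsabschluss"  # starts with a non-ASCII section sign

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "clause.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write(clause)
    loaded = read_text(path)
```

With the encoding stated on both the write and the read, the round trip should give the same text on Windows, Linux, and macOS.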