OpenNyAI / Opennyai

Opennyai : An efficient NLP Pipeline for Indian Legal documents
MIT License
67 stars 9 forks

[SUGGESTION] Can we run the NER Model on the Judgment Summaries? #2

Closed d-saikrishna closed 1 year ago

d-saikrishna commented 1 year ago

System Information

opennyai 0.0.10 Python 3.9.13 Ubuntu 22.04.1 LTS

Description

Running NER on the entire judgment does not always help our use case. For instance, NER gives us every STATUTE and PROVISION mentioned in a judgment. But judges often refer to many statutes and provisions, and many of them may not be relevant to the case at hand.

Here is how I tried to solve it:

  1. I used the Extractive Summariser to get 5 types of summaries first.
  2. Our legal researcher said that the STATUTES and PROVISIONS mentioned in the Preamble summary and Decision Summary are the relevant ones. So I tried to run NER only on these two summaries as input.

But OpenNyAI's NER often fails on this input, because it needs to find the PREAMBLE part of the judgment before it can run NER. Naturally, when I supply a summary, the model fails to identify any PREAMBLE.

So, I just used regex on the Preamble and Decision summaries to find our relevant STATUTES and PROVISIONS. It would have been nicer to use NER instead.
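
For illustration only, a regex workaround of this kind could look roughly like the sketch below. The patterns, the function name `extract_citations`, and the sample text are all hypothetical and are not the ones used in the original work; real judgments cite law in many more forms than these two patterns cover.

```python
import re

# Hypothetical pattern: "Section 302", "Article 14", "Section 13(1)(a)", ...
PROVISION_RE = re.compile(
    r"\b(?:Section|Article|Rule|Order)s?\s+\d+[A-Z]*(?:\(\w+\))*"
)

# Hypothetical pattern: a run of capitalised words (allowing "of", "and",
# "the" in between) ending in "Act" or "Code", with an optional year.
STATUTE_RE = re.compile(
    r"\b[A-Z][A-Za-z]*(?:\s+(?:of|and|the|[A-Z][A-Za-z]*))*"
    r"\s+(?:Act|Code)\b(?:,\s*\d{4})?"
)


def extract_citations(summary_text):
    """Return de-duplicated provisions and statutes found in a summary."""
    provisions = PROVISION_RE.findall(summary_text)
    statutes = STATUTE_RE.findall(summary_text)
    # dict.fromkeys keeps the first occurrence and preserves order.
    return {
        "PROVISION": list(dict.fromkeys(provisions)),
        "STATUTE": list(dict.fromkeys(statutes)),
    }


# Made-up decision summary, used only to exercise the patterns.
decision_summary = (
    "The appeal under Section 138 of the Negotiable Instruments Act, 1881 "
    "is dismissed. The conviction under Section 302 of the Indian Penal "
    "Code is upheld."
)
citations = extract_citations(decision_summary)
```

A pattern-based approach like this misses short forms such as "IPC" or "CrPC", which is one reason a trained NER model would be preferable here.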

amant555 commented 1 year ago

Hi @d-saikrishna, that looks like an interesting idea. On the point of NER not working: it should work on the input you describe even if the preamble is not identified. I would suggest setting ner_do_sentence_level and ner_do_postprocess to False. This should resolve the error. Let me know if the issue remains.
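
A minimal sketch of that suggestion, assuming the `Pipeline` constructor accepts these two flags as keyword arguments and that `Data` wraps a list of strings. The helper name `run_ner_on_summaries` is hypothetical, and the exact signature should be checked against the installed opennyai version.

```python
def run_ner_on_summaries(summaries, use_gpu=False):
    """Run only the NER component on short texts such as summary sections.

    `summaries` is a list of strings, e.g. the preamble and decision
    summaries produced by the extractive summarizer.
    """
    # Imported lazily so the sketch can be read without opennyai installed.
    from opennyai import Pipeline
    from opennyai.utils import Data

    data = Data(summaries)
    pipeline = Pipeline(
        components=["NER"],
        use_gpu=use_gpu,
        # Assumption: both flags are constructor keyword arguments.
        ner_do_sentence_level=False,
        ner_do_postprocess=False,
    )
    return pipeline(data)
```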

prathameshk commented 1 year ago

There is also a simpler way to do this. If you only want entities from sentences that are in the summary and have a particular rhetorical role (RR), it is easier to filter the output of opennyai. For each sentence, the output has a flag indicating whether it should be in the summary, along with the entities found in that sentence. So pick all the sentences that are in the summary and have an RR of decision or preamble, and then keep only the statutes and provisions from those.
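
Roughly, that filtering could be sketched as follows. The key names (`annotations`, `labels`, `in_summary`, `entities`, `text`) and the role labels `PREAMBLE` and `RPC` are assumptions about the shape of the pipeline output, and the sample below is hand-made rather than real model output, so check both against what your opennyai version actually returns.

```python
RELEVANT_ENTITY_LABELS = {"STATUTE", "PROVISION"}


def relevant_entities(result, roles=("PREAMBLE", "RPC")):
    """Collect statutes and provisions from summary sentences with given roles.

    `result` is one document's output; `roles` are the rhetorical-role
    labels to keep (assumed names, adjust to the real label set).
    """
    wanted_roles = set(roles)
    found = []
    for sentence in result.get("annotations", []):
        if not sentence.get("in_summary"):
            continue  # sentence was not selected for the summary
        if not wanted_roles & set(sentence.get("labels", [])):
            continue  # sentence has a different rhetorical role
        for entity in sentence.get("entities", []):
            for label in entity.get("labels", []):
                if label in RELEVANT_ENTITY_LABELS:
                    found.append((label, entity["text"]))
    return found


# Hand-made sample in the assumed output shape.
sample_result = {
    "annotations": [
        {"labels": ["PREAMBLE"], "in_summary": True,
         "entities": [{"labels": ["COURT"], "text": "High Court of Delhi"}]},
        {"labels": ["ANALYSIS"], "in_summary": True,
         "entities": [{"labels": ["STATUTE"], "text": "Evidence Act"}]},
        {"labels": ["RPC"], "in_summary": True,
         "entities": [{"labels": ["PROVISION"], "text": "Section 302"},
                      {"labels": ["STATUTE"], "text": "Indian Penal Code"}]},
        {"labels": ["RPC"], "in_summary": False,
         "entities": [{"labels": ["PROVISION"], "text": "Section 34"}]},
    ]
}
entities = relevant_entities(sample_result)
```

This way the full pipeline runs once on the whole judgment, so the NER model still sees the preamble, and the narrowing happens afterwards on the output.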

On Tue, Mar 28, 2023 at 9:42 PM Sai Krishna @.***> wrote:

-- Thanks and regards

Prathamesh Kalamkar

d-saikrishna commented 1 year ago

Thank you @amant555 @prathameshk. Will check these approaches in my work ahead!