Closed: lvaleriu closed this issue 3 years ago
Hi @lvaleriu,
That's right. Hierarchical BERT (see the implementation of HierarchicalBert
at https://github.com/coastalcph/lex-glue/blob/main/models/hierbert.py) considers a list of text segments (e.g., sentences, paragraphs). Each segment is first encoded on its own by the pre-trained model (named encoder
in the code), and then a second, segment-level Transformer (named seg_encoder
in the code) fuses the segment encodings into a final document representation.
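To make the two levels concrete, here is a minimal, self-contained PyTorch sketch of the idea. It is not the repo's exact HierarchicalBert; the class name TwoLevelEncoder, the pooling choices, and the hyperparameters below are illustrative assumptions.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class TwoLevelEncoder(nn.Module):
    """Sketch of a two-level (hierarchical) document encoder.

    Level 1: a pre-trained model encodes each segment independently.
    Level 2: a small Transformer fuses the per-segment vectors.
    """

    def __init__(self, model_name="bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)  # segment-level
        hidden = self.encoder.config.hidden_size
        layer = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=8, batch_first=True
        )
        self.seg_encoder = nn.TransformerEncoder(layer, num_layers=2)  # document-level

    def forward(self, input_ids, attention_mask):
        # input_ids / attention_mask: (batch, segments, seq_len)
        b, s, l = input_ids.shape
        out = self.encoder(
            input_ids=input_ids.view(b * s, l),
            attention_mask=attention_mask.view(b * s, l),
        )
        # One [CLS] vector per segment -> (batch, segments, hidden)
        seg_cls = out.last_hidden_state[:, 0, :].view(b, s, -1)
        # Fuse segment encodings into contextualized segment representations
        fused = self.seg_encoder(seg_cls)
        # Max-pool over segments for a single document vector (one possible choice)
        return fused.max(dim=1).values

# Usage sketch:
# tok = AutoTokenizer.from_pretrained("bert-base-uncased")
# segments = ["First paragraph ...", "Second paragraph ..."]
# enc = tok(segments, padding="max_length", truncation=True,
#           max_length=128, return_tensors="pt")
# doc_vec = TwoLevelEncoder()(enc["input_ids"].unsqueeze(0),
#                             enc["attention_mask"].unsqueeze(0))
```

The key point is that attention is quadratic in sequence length, so encoding many short segments and then fusing their summaries is far cheaper than feeding the whole document through one flat encoder.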
In the examined tasks, we have gold-standard factual paragraphs in the case of ECtHR A/B, and silver-standard newline-separated paragraphs in the case of SCOTUS.
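For instance, such silver-standard segments can be derived by splitting on newlines; the helper below is a hypothetical sketch (the function name and segment cap are not from the repo):

```python
# Hypothetical helper: split a SCOTUS-like opinion into "silver-standard"
# paragraph segments on newlines, capping the number of segments kept.
def split_into_segments(text, max_segments=64):
    paragraphs = [p.strip() for p in text.split("\n") if p.strip()]
    return paragraphs[:max_segments]

doc = "First paragraph of the opinion.\nSecond paragraph.\n\nThird paragraph."
print(split_into_segments(doc))
# ['First paragraph of the opinion.', 'Second paragraph.', 'Third paragraph.']
```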
Such questions are better suited to the Discussions section 🤗
Hello! Thank you for starting this project.
I have a small question about the hierbert model (HierarchicalBert). You use it to:
replace flat BERT encoder with hierarchical BERT encoder.
The hierarchy isn't about the labels/classes (classes could belong to a hierarchical tree), right? The hierarchy you mention relates to the text/token segments in a document, i.e., you consider a document not just as one big block of plain text but as a list of text segments, and you give that information to the model?
Thank you for any information.