aphp / edsnlp

Modular, fast NLP framework, compatible with Pytorch and spaCy, offering tailored support for French clinical notes.
https://aphp.github.io/edsnlp/
BSD 3-Clause "New" or "Revised" License

Add overlap_policy='merge' option to make_sentence_span_getter #262

Closed by percevalw 6 months ago

percevalw commented 6 months ago

Description

@aricohen93

Training config example:

[components.qualifier.embedding.embedding.embedding]
@factory = "eds.transformer"
model = "camembert-base"
window = 255
stride = 128
span_getter = {
    "@misc": "eds.span_context_getter",
    "span_getter": ${components.qualifier.embedding.span_getter},
    "context_words": 30,  # add 30 words on each side
    "context_sents": 2,  # entity sentence + 1 on each side
    "overlap_policy": "merge"
    }
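
The description does not spell out what `"overlap_policy": "merge"` does. As a rough illustration only, assuming it means that overlapping context spans are fused into a single span before being passed to the embedding, the idea can be sketched in plain Python. The `merge_overlapping` helper and the `(start, end)` tuple representation are hypothetical and are not part of the edsnlp API:

```python
from typing import List, Tuple

# Hypothetical representation: (start, end) token offsets, end exclusive
Span = Tuple[int, int]


def merge_overlapping(spans: List[Span]) -> List[Span]:
    """Fuse spans that overlap into single, larger spans (illustrative only)."""
    merged: List[Span] = []
    for start, end in sorted(spans):
        if merged and start < merged[-1][1]:
            # Overlaps the previous span: extend it instead of adding a new one
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


# Context windows around three entities; the first two overlap
contexts = [(0, 40), (25, 70), (90, 120)]
print(merge_overlapping(contexts))  # [(0, 70), (90, 120)]
```

In this sketch, spans that merely touch (one ends exactly where the next starts) are kept separate; whether the library treats that case the same way is not stated in this PR.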


codecov[bot] commented 6 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 96.96%. Comparing base (084aa4f) to head (ce71be8).

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #262      +/-   ##
==========================================
+ Coverage   96.95%   96.96%   +0.01%
==========================================
  Files         255      255
  Lines        8591     8622      +31
==========================================
+ Hits         8329     8360      +31
  Misses        262      262
```
