NorskRegnesentral / text-anonymization-benchmark

Annotated corpus + evaluation metrics for text anonymisation
MIT License

Sharing code of experiments with RoBERTa, Presidio, and Longformer #5

Closed: mariasierro closed this issue 1 year ago

mariasierro commented 1 year ago

Hello! I recently read your paper and found your work very interesting. In fact, I might use it as a basis for building a corpus of legal texts in French and Spanish for anonymization. Thank you for sharing the annotated corpus as well as the evaluation code. Would it be possible to also share the code that you used in your anonymization experiments with the RoBERTa language model, Presidio, and the fine-tuned Longformer? It would be awesome :) I would like to be able to replicate your experiments before starting my own.

anthipapa commented 1 year ago

Hi, and thank you for your interest! Of course. I'm currently working on this and will upload the relevant code soon! :)

anthipapa commented 1 year ago

Hi again! The code is now uploaded. Feel free to send any questions or requests for clarification directly to anthip@ifi.uio.no

mariasierro commented 1 year ago

That's great, thank you very much!