allenai / scispacy

A full spaCy pipeline and models for scientific/biomedical documents.
https://allenai.github.io/scispacy/
Apache License 2.0

How would someone replicate scispacy vectors? #416

Closed by azhx 2 years ago

azhx commented 2 years ago

https://github.com/allenai/scispacy/blob/main/docs/index.md describes the data sources used, but I'm not entirely clear on how exactly the vector tables were trained for, say, en_core_sci_md vs. en_core_sci_lg.

I couldn't find anything in the original scispacy paper either. What specific algorithms and libraries were used?

dakinggg commented 2 years ago

Please see https://github.com/allenai/scispacy/issues/83