naver / biobert-pretrained

BioBERT: a pre-trained biomedical language representation model for biomedical text mining

How do you pre-process the PMC articles? #25

Open LeoWood opened 3 years ago

LeoWood commented 3 years ago

Hi, I have a question. The number of PMC articles is huge, and the pre-processing procedure requires sentence segmentation of every paragraph. How did you complete the sentence segmentation quickly?
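To make the question concrete, here is a minimal sketch of the kind of pipeline being asked about: a cheap rule-based sentence splitter run over paragraphs in parallel worker processes. This is an assumption for illustration only, not the method the BioBERT authors used. The names `split_sentences` and `segment_corpus` are hypothetical, and the regex is a rough heuristic that will mis-split on abbreviations such as "e.g." or "Fig.".

```python
import re
from multiprocessing import Pool

# Heuristic boundary: sentence-ending punctuation, then whitespace,
# then an uppercase letter or digit. An assumption for illustration;
# it does not handle abbreviations or other biomedical edge cases.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z0-9])")


def split_sentences(paragraph):
    """Split one paragraph into a list of sentences."""
    parts = _BOUNDARY.split(paragraph.strip())
    return [p.strip() for p in parts if p.strip()]


def segment_corpus(paragraphs, workers=1, chunksize=1000):
    """Segment many paragraphs, optionally across worker processes."""
    if workers <= 1:
        # Sequential path: no process start-up overhead.
        return [split_sentences(p) for p in paragraphs]
    # Parallel path: each worker handles chunks of paragraphs.
    with Pool(workers) as pool:
        return pool.map(split_sentences, paragraphs, chunksize=chunksize)


if __name__ == "__main__":
    sample = ["BRCA1 is a gene. It repairs DNA.", "One sentence only."]
    for sentences in segment_corpus(sample, workers=2, chunksize=1):
        print(sentences)
```

Since paragraphs are independent of each other, the work splits cleanly across processes, and a larger `chunksize` reduces inter-process overhead on a corpus of this size. A trained segmenter would likely be more accurate than this regex but slower per paragraph.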