scrapinghub / webstruct

NER toolkit for HTML data
256 stars, 59 forks

Pretrained Models #37

Open NightFury13 opened 7 years ago

NightFury13 commented 7 years ago

Webstruct looks like a really cool extension to have for any scraping enthusiast, so thank you for creating this! It would be really awesome if you guys could also release some pre-trained models along with this library. It's not feasible for every user to have loads of annotated data, and what most people are looking for are the most common entities (NAME, PLACE, ORGANISATION, etc.). A humble suggestion :smile:

manugarri commented 7 years ago

What do you mean by models?

NightFury13 commented 7 years ago

Instead of having to annotate data and train on it, could we simply load a configuration/parameter file (a model) and run new data against it? A prebuilt NER engine is what I meant by a trained model.
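
To make the request concrete, here is a rough sketch of the workflow I have in mind, using only the Python standard library. Everything in it is hypothetical: `ToyTagger`, `save_model`, `load_model`, and the file name are stand-ins I made up for illustration, not webstruct API, and a real model would be a trained sequence tagger rather than a lookup table.

```python
import pickle
import tempfile
from pathlib import Path


class ToyTagger:
    """Hypothetical stand-in for a trained NER model (not a webstruct class)."""

    def __init__(self, lexicon):
        # token -> entity label, fixed ahead of time by whoever publishes the model
        self.lexicon = dict(lexicon)

    def predict(self, tokens):
        # Label each token; "O" marks tokens outside any entity.
        return [self.lexicon.get(token, "O") for token in tokens]


def save_model(model, path):
    # Publisher side: serialize the already-trained model to a file.
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    # User side: restore the model. Only unpickle files from a trusted source.
    with open(path, "rb") as f:
        return pickle.load(f)


# Publisher side: done once, then the file is shipped alongside the library.
model_path = Path(tempfile.gettempdir()) / "toy_ner_model.pkl"
save_model(ToyTagger({"Acme": "ORGANISATION", "Paris": "PLACE"}), model_path)

# User side: no annotation and no training, just load and predict.
tagger = load_model(model_path)
labels = tagger.predict(["Acme", "opened", "an", "office", "in", "Paris"])
print(labels)
```

The point is only the split between the two sides: the expensive annotate-and-train step happens once on the publisher side, and users only need the load-and-predict part.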

rmotsar commented 6 years ago

This is still a relevant question.

HAMZA310 commented 3 years ago

Is a generic pretrained model available? A model that has already been trained on sufficient annotated HTML data, and can be used quickly without requiring any training.