hooshvare / parsbert

🤗 ParsBERT: Transformer-based Model for Persian Language Understanding
https://doi.org/10.1007/s11063-021-10528-4
Apache License 2.0

How to use NER for large dataset ? #13

Closed Mohammadtvk closed 3 years ago

Mohammadtvk commented 3 years ago

Hi,

I want to use your pretrained model for the NER task, but there is a problem: the tutorial notebook feeds documents to the model one by one, and this takes too long for my dataset. How can I use it more efficiently? Can I use padding to feed larger batches to the model?

m3hrdadfi commented 3 years ago

Hi,

As far as I understand, if you want to use ParsBERT in a downstream NLP task, in this case your large NER dataset, you must fine-tune the pre-trained model on it. There is no need to train BERT from scratch again. Moreover, regarding the Sentiment Analysis notebook you mentioned, I used a batch size of 16 for the train, validation, and test parts (Configuration -> # general config)!

Follow this example to understand how to use BERT or other pre-trained models for the NER case.
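For the batching part of the question, here is a minimal sketch of running a fine-tuned token-classification checkpoint over a padded batch instead of one document at a time. The checkpoint name is an assumption; substitute whatever ParsBERT NER model you fine-tuned.

```python
# Sketch: batched NER inference with a fine-tuned token-classification model.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

model_name = "HooshvareLab/bert-base-parsbert-ner-uncased"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)
model.eval()

docs = ["جمله اول", "جمله دوم کمی طولانی‌تر است"]  # your documents

# Pad/truncate the whole batch to one length so it goes through in a single pass.
batch = tokenizer(docs, padding=True, truncation=True, max_length=512,
                  return_tensors="pt")

with torch.no_grad():
    logits = model(**batch).logits  # (batch, seq_len, num_labels)

pred_ids = logits.argmax(dim=-1)
for i, doc in enumerate(docs):
    tokens = tokenizer.convert_ids_to_tokens(batch["input_ids"][i])
    labels = [model.config.id2label[p.item()] for p in pred_ids[i]]
    # Padding tokens carry no information; mask them out before post-processing.
    keep = batch["attention_mask"][i].bool().tolist()
    print([(t, l) for t, l, k in zip(tokens, labels, keep) if k])
```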

mitramir55 commented 3 years ago

Hi, one quick note: if you're not planning to fine-tune, you can use hazm or simply .split('.') to separate the sentences, feed those to BERT so all of your data goes through the model, and then join the sentences of each record back together in a for loop at the end for NER-related tasks.
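A rough sketch of that split-then-rejoin idea (hazm's sentence tokenizer here, though a plain `.split('.')` also works; `run_ner_batch` is just a placeholder for whatever batched NER call you use, e.g. the snippet above, not a library API):

```python
# Sketch: split long records into sentences, keep track of which record each
# sentence came from, then stitch the per-sentence NER results back per record.
from hazm import sent_tokenize  # or: sentences = text.split('.')

def split_records(records):
    """Return (sentences, owner_index) so results can be re-joined per record."""
    sentences, owners = [], []
    for idx, text in enumerate(records):
        for sent in sent_tokenize(text):
            sentences.append(sent)
            owners.append(idx)
    return sentences, owners

records = ["متن بلند اول ...", "متن بلند دوم ..."]
sentences, owners = split_records(records)

# entities = run_ner_batch(sentences)        # placeholder for your batched NER call
# per_record = [[] for _ in records]
# for owner, ents in zip(owners, entities):  # re-join results record by record
#     per_record[owner].extend(ents)
```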

Mohammadtvk commented 3 years ago

Back after a while. Thanks for the replies.

I have another question: is it possible to use transformer-based models for feature extraction, like word2vec or doc2vec? I have used FeatureExtractionPipeline with several pretrained models, but there are two problems: 1. I can't set the embedding size (and the models' embedding sizes are small), and 2. it takes a long time (around 0.5 s) per document just to extract the word embeddings.
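As a side note on that question, here is a hedged sketch of getting fixed-size document embeddings by mean-pooling the last hidden states over a padded batch. The hidden size (768 for a base model) is fixed by the architecture, so it can't be changed without adding your own projection layer; batching is what mainly cuts the per-document latency. The checkpoint name is an assumption.

```python
# Sketch: batched document embeddings via masked mean pooling of BERT outputs.
import torch
from transformers import AutoTokenizer, AutoModel

model_name = "HooshvareLab/bert-base-parsbert-uncased"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

docs = ["سند اول", "سند دوم"]
batch = tokenizer(docs, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # (batch, seq_len, 768)

# Mask out padding positions before averaging so short documents are not diluted.
mask = batch["attention_mask"].unsqueeze(-1).float()
doc_embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(doc_embeddings.shape)  # torch.Size([2, 768])
```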