MikeDoes / EPO-Hackahton


First training task: supervised prediction of 1 class with patent-BERT #12

Open Oblynx opened 1 year ago

Oblynx commented 1 year ago

Let's use one of these models:

Let's generate a small test dataset.

Goal: supervised classification for the class "CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO WASTEWATER TREATMENT OR WASTE MANAGEMENT": Y02W

MikeDoes commented 1 year ago

https://worldwide.espacenet.com/patent/cpc-browser#!/CPC=Y02W

MikeDoes commented 1 year ago

Source: the patent claims. Label: 1 or 0 for whether the patent is part of Y02W.
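A minimal sketch of how such labels could be derived, assuming each patent record carries its claims text and a list of CPC codes. The record layout (`claims`, `cpc` keys) and the helper names `label_for` and `build_examples` are hypothetical, and the two sample records are made up for illustration:

```python
from typing import Dict, Iterable, List, Tuple

TARGET_CLASS = "Y02W"  # CPC class named in this issue


def label_for(cpc_codes: Iterable[str], target: str = TARGET_CLASS) -> int:
    """Return 1 if any CPC code falls under the target class, else 0."""
    return int(
        any(code.replace(" ", "").upper().startswith(target) for code in cpc_codes)
    )


def build_examples(patents: List[Dict]) -> List[Tuple[str, int]]:
    """Turn (hypothetical) patent records into (claims_text, label) pairs."""
    return [(p["claims"], label_for(p["cpc"])) for p in patents]


# Made-up records, for illustration only.
patents = [
    {"claims": "A method for treating wastewater ...", "cpc": ["Y02W 10/10", "C02F 3/00"]},
    {"claims": "A gear assembly comprising ...", "cpc": ["F16H 1/00"]},
]
examples = build_examples(patents)
```

Matching on the code prefix means every subgroup under Y02W counts as a positive, which seems to be what the binary label above intends.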

Oblynx commented 1 year ago

Ambitious goal: unsupervised training.

Example: Word2Vec creates an embedding of the class description, and we train BERT using this embedding in the loss function.
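One way to read "using this embedding in the loss function" is a cosine-distance term between the model's output vector for a document and the fixed embedding of the class description. The sketch below is framework-free and only shows that term. The function names are hypothetical, both vectors are assumed to live in the same space, and the thread does not say how documents outside the class would be handled:

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def class_embedding_loss(doc_vec: Sequence[float], class_vec: Sequence[float]) -> float:
    """Cosine distance in [0, 2]: 0 when the document vector points the same
    way as the class-description embedding, 2 when it points the opposite way."""
    return 1.0 - cosine_similarity(doc_vec, class_vec)
```

In an actual training loop the same quantity would be computed on tensors so that gradients flow back into BERT; the plain-Python version here is only meant to pin down the arithmetic.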

Workflow: