abhinavkashyap / sciwing

SciWING is a modern toolkit for scientific document processing from WING-NUS
https://www.sciwing.io
MIT License

Other languages #24

Open rodyoukai opened 4 years ago

rodyoukai commented 4 years ago

Hi, is there a way to select another language for analyzing documents?

Are there pre-trained models in other languages? I'm interested in Spanish and French.

knmnyn commented 4 years ago

Hi, at this time we don't have embeddings for other languages, so unfortunately we don't have support for this. Do let us know if you're interested in helping to develop this. Thanks. @abhinavkashyap may be able to comment more about word embeddings for other languages.

rodyoukai commented 4 years ago

Sure, I would like to develop training data in other languages. Can you point me to documentation so I can learn how to do this? I would really like to learn a lot about this topic.

abhinavkashyap commented 4 years ago

Hi @rodyoukai. Thank you for contacting us. Yes, right now we do not deal with other languages. Are you interested in any particular task in SciWING?

rodyoukai commented 4 years ago

Hi @abhinavkashyap, I want to work with Spanish. Can you tell me where I can start? I am a little lost.

abhinavkashyap commented 3 years ago

Hi @rodyoukai. We are hard at work to make this possible in future versions of the library. For example, citation string parsing, which is included with this library, does not work with reference strings from other languages. Can you take a look at models/neural_parscit.py? We use an ElmoEmbedder to do this, but that does not work with Spanish, for example. In the future we can include a pre-trained Spanish transformer for citation string parsing. If you are ready to work on that, I would be happy to chat with you.
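
As a rough illustration of the idea of swapping the embedder per language, here is a minimal sketch. It is not SciWING's actual API: the names `Embedder`, `make_hash_embedder`, `register_embedder`, and `get_embedder` are all hypothetical, and the toy hashing embedder only stands in for a real pre-trained model (ELMo for English, or a Spanish transformer). The point is only that the parser could look up its embedder by language code instead of hard-coding one.

```python
import hashlib
from typing import Callable, Dict, List

# Hypothetical interface: an embedder maps tokens to one vector per token.
Embedder = Callable[[List[str]], List[List[float]]]


def make_hash_embedder(dim: int, salt: str) -> Embedder:
    """Toy stand-in for a pre-trained model (assumption, for illustration only).

    Vectors are deterministic but carry no linguistic meaning.
    """
    if not 1 <= dim <= 32:
        raise ValueError("dim must be between 1 and 32 (SHA-256 yields 32 bytes)")

    def embed(tokens: List[str]) -> List[List[float]]:
        vectors = []
        for token in tokens:
            digest = hashlib.sha256((salt + token.lower()).encode("utf-8")).digest()
            # Scale the first `dim` bytes into the range [0, 1].
            vectors.append([byte / 255.0 for byte in digest[:dim]])
        return vectors

    return embed


# Hypothetical registry keyed by language code; only English is present at first.
_EMBEDDERS: Dict[str, Embedder] = {"en": make_hash_embedder(8, "en")}


def register_embedder(lang: str, embedder: Embedder) -> None:
    """Make an embedder available for a language code such as 'es' or 'fr'."""
    _EMBEDDERS[lang] = embedder


def get_embedder(lang: str) -> Embedder:
    """Return the embedder for `lang`, failing loudly if none is registered."""
    if lang not in _EMBEDDERS:
        raise ValueError(f"no embedder registered for language {lang!r}")
    return _EMBEDDERS[lang]
```

With something like this, adding Spanish would amount to calling `register_embedder("es", ...)` with an embedder backed by a real Spanish model, and then training the citation parser on Spanish reference strings with it.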

rodyoukai commented 3 years ago

Hi again @abhinavkashyap, of course I want to work on this. I recently started my PhD in computer science, and this is a problem that I need to solve. As you can see, I am not a native English speaker, but I learn fast. If you can recommend some documentation, tutorials, or any other information, I would appreciate it.