geoai-lab / NeuroTPR

NeuroTPR: a Neuro-net ToPonym Recognition model for extracting locations from social media messages
GNU General Public License v3.0

Question about data format/features #4

Open rsuwaileh opened 2 years ago

rsuwaileh commented 2 years ago

Hello,

I looked at the WNUT2017_feature_added.txt file to understand which features I need to prepare to retrain NeuroTPR. I found that it contains four columns (Token, BIO label, NE, and POS).

Can you explain what exactly the third column/feature is, the one containing the NE tags LOCATION, PERSON, and ORGANIZATION? They look like the NER tags from the Stanford NER classifier (with 3 classes), but it is not clear to me why and how they are used in training NeuroTPR.

Thanks, Reem

YingjieHu commented 2 years ago

Hi Reem,

Thank you for your interest in our work!

Yes, the third column contains Named Entity tags, i.e., LOCATION, PERSON, and ORGANIZATION. We initially used Stanford NER to obtain these tags, which served as an additional input feature for NeuroTPR. However, after more experiments, we found that these NE tags tend to be erroneous because tweets are short and their sentence structure is unorganized. We eventually dropped this input feature, so it was not used for training the final NeuroTPR model.

Hope this information helps.

Best wishes, Yingjie
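
For readers preparing their own data, here is a minimal sketch of reading a four-column file like the one described above while ignoring the unused NE column. It is not taken from the NeuroTPR code base. It assumes whitespace-separated columns in the order Token, BIO label, NE, POS, one token per line, and a blank line between sentences. The function name `read_sentences`, the label spelling `B-location`, and the sample rows are hypothetical and only illustrate the layout.

```python
from typing import Iterable, List, Tuple

# One parsed row: (token, BIO label, POS tag). The NE column is dropped.
Row = Tuple[str, str, str]


def read_sentences(lines: Iterable[str]) -> List[List[Row]]:
    """Group four-column rows into sentences, skipping the NE column.

    Assumed layout (not confirmed by the repository): whitespace-separated
    columns "token bio ne pos", with blank lines separating sentences.
    """
    sentences: List[List[Row]] = []
    current: List[Row] = []
    for line_no, line in enumerate(lines, start=1):
        parts = line.split()
        if not parts:
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        if len(parts) != 4:
            raise ValueError(f"line {line_no}: expected 4 columns, got {len(parts)}")
        token, bio, _ne, pos = parts  # _ne is read but intentionally unused
        current.append((token, bio, pos))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical sample rows, invented for illustration only.
sample = [
    "Flooding O O NN",
    "in O O IN",
    "Houston B-location LOCATION NNP",
    "",
    "Stay O O VB",
    "safe O O JJ",
]

for sentence in read_sentences(sample):
    print(sentence)
```

To read an actual file, pass an open file handle, e.g. `read_sentences(open(path, encoding="utf-8"))`. If the real file uses a different delimiter or column order, the unpacking line is the only place that needs to change.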
