Closed by PolinaPanicheva 2 years ago
I wonder whether the GazPNE2 approach could work for this task, as a way of linking a tweet and the place names in it to the corresponding coordinates. https://www.researchgate.net/publication/355711839_GazPNE2_A_general_place_name_extractor_for_tweets_fusing_gazetteers_deep_learning_and_transformer_models https://github.com/uhuohuy/GazPNE2
Um, is this still relevant? Or should I not wait for feedback?
Yes, all of our open challenges are still relevant and available to complete. We contact everyone who has successfully completed a challenge regarding their next steps with us.
Here is my version https://github.com/StopTestingRightNow/Tweets_Geolocation
@Lavriz successfully solved the challenge and was hired by Inca Digital.
Context
The goal is to create and train a deep learning model that predicts the coordinates (latitude, longitude) of individual tweets. You are free to use any approach, but we have a few suggestions. Our current idea is to use a simple Character CNN architecture that would capture the most prominent character sequences related to location-specific language variety, and probably the most common location names. We suggest avoiding complex linguistic features and structures in this model, specifically Named Entity Recognition and Linking. Please apply with a rough overview of your model architecture. There are no hard MSE or EER requirements: we are after a scalable model architecture that will allow us to increase the training dataset size later on.
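As an illustration of the input side of a character-level model, here is a minimal sketch of turning tweet text into fixed-length sequences of character ids, which is the form a Character CNN's embedding layer typically consumes. It is not the challenge's reference code: the names `build_vocab` and `encode_tweet`, the `min_count` threshold, and the 280-character limit are all assumptions made for the example.

```python
from collections import Counter

PAD_ID = 0  # id reserved for padding (assumption of this sketch)
UNK_ID = 1  # id reserved for characters outside the vocabulary


def build_vocab(texts, min_count=1):
    """Map each sufficiently frequent character to an integer id (starting at 2)."""
    counts = Counter(ch for text in texts for ch in text.lower())
    kept = sorted(ch for ch, n in counts.items() if n >= min_count)
    return {ch: i + 2 for i, ch in enumerate(kept)}


def encode_tweet(text, vocab, max_len=280):
    """Lowercase, truncate to max_len characters, map to ids, and right-pad."""
    ids = [vocab.get(ch, UNK_ID) for ch in text.lower()[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


if __name__ == "__main__":
    # Hypothetical toy corpus; the real vocabulary would come from the training tweets.
    vocab = build_vocab(["hola desde lima", "olá de são paulo"])
    print(encode_tweet("Hola Lima!", vocab, max_len=16))
```

Building the vocabulary from the data rather than from a fixed ASCII alphabet keeps accented characters, which matter for Spanish and Portuguese text. Working at the character level also means no tokenizer or entity linker is needed, in line with the suggestion above.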
Development dataset
We have 4M tweets from 3,361 locations covering South America, written in 2021. Each .csv file is named with the coordinates _(latitudelongitude) and contains the text of the tweet (column text) and some meta-information.
Deliverable
A model which takes a tweet text as input and returns the coordinates as output; the model evaluation metrics obtained on the development dataset, including Mean Absolute Error in kilometers. We will evaluate the model using the test dataset that is not shared here.
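One way to report an error in kilometers is to average the great-circle distance between predicted and true coordinates. The sketch below uses the haversine formula. Reading "Mean Absolute Error in kilometers" as mean haversine distance is an assumption, as are the function names and the mean Earth radius constant; the evaluation actually used on the hidden test set may differ.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def mean_error_km(predicted, actual):
    """Mean haversine distance over paired (lat, lon) tuples."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(haversine_km(p[0], p[1], a[0], a[1])
                for p, a in zip(predicted, actual))
    return total / len(predicted)


if __name__ == "__main__":
    # Hypothetical predictions versus ground truth, as (latitude, longitude).
    preds = [(-12.05, -77.04), (-23.55, -46.63)]
    truth = [(-12.00, -77.00), (-22.90, -43.20)]
    print(f"mean error: {mean_error_km(preds, truth):.1f} km")
```

Unlike a plain MSE on raw latitude and longitude, this distance accounts for the fact that a degree of longitude covers fewer kilometers away from the equator.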
Resources
Successful submissions
🎉 @Lavriz successfully solved the challenge and was hired by Inca Digital.