jacekplocharczyk opened this issue 4 years ago
One comment about data licensing:
TODOs:
- Licensing of the data: the dataset is publicly shared on GitHub by Johns Hopkins University (https://github.com/CSSEGISandData/COVID-19), and there are license conditions (educational and academic purposes; hopefully the competition fits this definition).
For me, it should be OK. @Mindgames what do you think?
- RMSLE is the most obvious and safe choice for evaluation.
So let's use it. I've added it to the main post.
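For reference, RMSLE is easy to compute; this is a generic sketch (the function and variable names are my own, not part of any agreed evaluation script):

```python
import numpy as np

def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error.

    log1p (log(1 + x)) keeps zero counts valid and dampens the effect
    of large absolute errors on big case numbers.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2)))

# Example: predicted vs. reported case counts for a few days
print(rmsle([100, 150, 220], [110, 140, 250]))
```

Because of the log transform, being off by 10 cases on a count of 100 costs far more than being off by 10 on a count of 10,000, which suits exponential-growth data.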
- It's a tricky part because we can use only public data, and people can easily submit scraped samples. So I am not sure how it should work out; I think the only option to have an interactive leaderboard is to use something like the last day of stats.
Participants send us their predictions for the next week in advance. We store them and rate them day by day, updating the leaderboard.
- Why not both?
For some, it could be controversial. There is a disclaimer on the Kaggle competition website:
We understand this is a serious situation, and in no way want to trivialize the human impact this crisis is causing by predicting fatalities. Our goal is to provide better methods for estimates that can assist medical and governmental institutions to prepare and adjust as pandemics unfold.
If we look professional, it's OK for me.
@jacekplocharczyk
Participants send us their predictions for the next week in advance. We store them and rate them day by day, updating the leaderboard.
I meant an interactive part where participants send their .csv and see the results on the leaderboard instantly.
Like, we can have the last day without labels and hope that people will not scrape them; then participants can compare how good they are against others.
What about creating a leaderboard with two columns: training-data error (last 7 days) and upcoming-data error (next 7 days plus the data from the first day of the hackathon - day 0)?
We could also count the 0-day error as only 10%, so it wouldn't be the deciding factor.
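A minimal sketch of what a 10% weighting could look like; the weight value and function name are illustrative assumptions, not a settled rule:

```python
# Hypothetical weighting: day-0 error contributes 10% of the combined
# score, upcoming-days error the remaining 90%.
DAY0_WEIGHT = 0.10

def combined_score(day0_error: float, upcoming_error: float) -> float:
    """Blend the two leaderboard columns into one ranking score."""
    return DAY0_WEIGHT * day0_error + (1 - DAY0_WEIGHT) * upcoming_error

# Even a large day-0 error barely moves the combined result
print(combined_score(0.5, 0.3))
```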
Edit: @Mindgames, which countries do we care about? Should teams predict only worldwide data (that won't be very insightful), or should we make a few categories, e.g.:
I've also added the idea of including climate data for each country to the main post.
@jacekplocharczyk
- Training-data error would only give intuition about how well a team is doing - not influence the final place.
Training data will be very misleading, because it only measures overfitting, not how well teams are doing. That means you could have a leaderboard where people with bad, overfitted models are on top, which takes away the main reason for having a leaderboard: comparing performance with others.
- Upcoming-data error would be the main target to minimize. We could give teams the first results (day 0) after they are revealed (probably a few hours after midnight, or at 8 am; the event ends at 11 am).
You are right, we should update scores after the data is available; that will make it more interactive. However, I think we should use this 0-day error as part of the public leaderboard: since our metric is just an average, we can simply recalculate the average scores with the new data.
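A sketch of the "recalculate the average" idea, assuming the leaderboard score is a plain mean of per-day errors (the numbers are placeholders):

```python
def updated_average(current_avg: float, n_days: int, new_day_error: float) -> float:
    """Fold one more daily error into a running mean without storing history."""
    return (current_avg * n_days + new_day_error) / (n_days + 1)

avg = 0.40                            # average RMSLE over 4 scored days so far
avg = updated_average(avg, 4, 0.20)   # day-0 result arrives
print(avg)
```

This way the public leaderboard can be refreshed each time a new day of ground truth is published, without re-scoring the whole submission.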
We could also count the 0-day error as only 10%, so it wouldn't be the deciding factor.
Why not just ignore them when calculating the final scores? Then nobody will be able to gain anything from cheating.
I've also added the idea of including climate data for each country to the main post.
:+1:
@raznem
Training data will be very misleading, because it only measures overfitting, not how well teams are doing. That means you could have a leaderboard where people with bad, overfitted models are on top, which takes away the main reason for having a leaderboard: comparing performance with others.
You are right, but for me it's only for checking whether a submission was sent successfully. Of course, teams should be aware that they can overfit on this train leaderboard, but since it is only meaningful until we get the 0-day results, I wouldn't be bothered.
We could also replace the train leaderboard with a 0-day leaderboard in the morning. <- this is a good idea for me.
- Upcoming-data error would be the main target to minimize. We could give teams the first results (day 0) after they are revealed (probably a few hours after midnight, or at 8 am; the event ends at 11 am).
You are right, we should update scores after the data is available; that will make it more interactive. However, I think we should use this 0-day error as part of the public leaderboard: since our metric is just an average, we can simply recalculate the average scores with the new data.
We could also count the 0-day error as only 10%, so it wouldn't be the deciding factor.
Why not just ignore them when calculating the final scores? Then nobody will be able to gain anything from cheating.
So do we agree that our plan will be the following:
@jacekplocharczyk
You are right, but for me it's only for checking whether a submission was sent successfully. Of course, teams should be aware that they can overfit on this train leaderboard, but since it is only meaningful until we get the 0-day results, I wouldn't be bothered.
If you check Kaggle or other competitions with leaderboards, they keep a public leaderboard during the whole event. This creates a competitive spirit, because you can see how good you are right now compared to others. If you only want to check whether a submission is correct, you don't need a leaderboard at all; just send feedback that it's correct. However, if we want to keep a public leaderboard and we don't use some hidden labels, it will be useless until 8 AM, which is close to the end of the competition. So if we accept results on the training data, they shouldn't be public, because they can only mislead people.
From my perspective, we have a tradeoff between giving participants more data and having competition from the beginning. I think one day of data is worth losing to have competition from the start rather than waiting till 8 AM, which is close to the end. How do you see this tradeoff?
We could also replace the train leaderboard with a 0-day leaderboard in the morning. <- this is a good idea for me.
Agree, we can update results live during the competition itself to make it more intense :)
- Update the leaderboard every day after the hackathon at 8 AM
Maybe let's not update it so early in the morning, in case something is broken? :)
DLL Global #1
04.04.2020
This time we will have a standard deep learning challenge.
General Info:
Challenge: Predict COVID-19 spreading across specific countries and worldwide
Criterion: Root mean squared logarithmic error (RMSLE)
Additional challenge: Find meaningful insights about COVID-19 (an open question)
Test set: see general TODOs
Dataset info/TODOs:
- Get the .csv files directly from their page
General TODOs:
We will use RMSLE for evaluation.
We will update this post when we find out more.