Open dangkh opened 6 months ago
Thank you for your comments on the dataset and your ideas for further developing the visualization.
Regarding the dataset:
Our project aims to produce a global weather model, so a dataset covering 154 countries at the state/province level is specific enough for the input (it is currently the best source we have found). Honestly, we do not clearly understand what you mean by "collection criteria". If it means the criteria for evaluating the credibility of the data sources, we collected the data from Worldbank.org, which is well known for its credible datasets, so we consider the data reliable. In fact, if you visit the link attached in the proposal, you will see that the full dataset is far larger than what we pushed to this repo; given the specific purpose and scope of the project, we use only a small part of it.
For NaN values, we agree that some provinces/states may have empty entries. We propose two alternatives. First, we can render them as non-interactive grey areas on the globe, indicating that data for them is not available. Second, we can average the values of each NaN area's surrounding neighbors to "simulate" its expected value and use that to fill in the dataset.
The choice of a decade-long visualization is mostly to reduce model complexity (3D rendering takes a long time).
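To make the second alternative concrete, here is a minimal sketch of neighbor averaging. It is only an illustration under stated assumptions: the function name `fill_from_neighbors`, the region names, and the adjacency map are all hypothetical, and the real project would need an actual province/state adjacency list. Regions with no usable neighbors stay NaN, so they could still fall back to the grey-area option.

```python
import math


def fill_from_neighbors(values, neighbors):
    """Return a copy of `values` where each NaN entry is replaced by the
    mean of its neighbors' original (non-NaN) values.

    values:    dict mapping region name -> float (NaN if missing)
    neighbors: dict mapping region name -> list of adjacent region names
    Both structures are assumptions made for this sketch.
    """
    filled = dict(values)
    for region, value in values.items():
        if not math.isnan(value):
            continue  # nothing to fill for this region
        # Only use neighbors that have real data in the original input,
        # so the result does not depend on iteration order.
        known = [
            values[n]
            for n in neighbors.get(region, [])
            if n in values and not math.isnan(values[n])
        ]
        if known:
            filled[region] = sum(known) / len(known)
        # Otherwise the region stays NaN (e.g. shown as a grey area).
    return filled


# Hypothetical data: B is missing with two known neighbors, D has none.
temps = {"A": 20.0, "B": float("nan"), "C": 24.0, "D": float("nan")}
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": []}
filled = fill_from_neighbors(temps, adjacency)
```

Here `filled["B"]` becomes the mean of A and C, while `filled["D"]` remains NaN because it has no neighbors with data.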
Concerning visualization:
Thanks Group I for the feedback.
I think handling NaN values is an important step. If done correctly, it helps make your predictions more robust. Given the time available, we might not need to make predictions for all the regions; instead, focus on those with complete data first.
I also feel that "collection criteria" is a vague term, so @tranvu21 can contact Team I if you want to clarify Team I's feedback.
Note that you don't need to reply to my comment. Good luck with your project.
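As a rough sketch of what "focus on those with complete data first" could look like, the snippet below keeps only regions whose yearly series has no NaN. The function name `complete_regions` and the sample data are hypothetical and are not taken from the project's repo.

```python
import math


def complete_regions(series_by_region):
    """Keep only regions whose series contains no NaN values.

    series_by_region: dict mapping region name -> list of floats,
    one value per year (an assumed layout for this sketch).
    """
    return {
        region: series
        for region, series in series_by_region.items()
        if all(not math.isnan(v) for v in series)
    }


# Hypothetical data: region B has a gap, so it is left out for now.
data = {
    "A": [20.1, 20.4, 20.9],
    "B": [18.0, float("nan"), 18.7],
    "C": [25.3, 25.1, 25.6],
}
usable = complete_regions(data)
```

With this sample, `usable` contains only regions A and C; region B could be handled later once a NaN strategy is chosen.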
This group has proposed an interesting idea. However, there are still some flaws that need to be discussed.
Regarding the dataset:
While the data is available, there are questions about the collection criteria.
Additionally, there might be some NaN values present. Could you elaborate on the techniques your group plans to use to address these?
Furthermore, why did your group collect data spanning 26 years but only simulate changes in weather factors over a period of 10 years?
Concerning visualization: