Peer review for proposal - Group I

dangkh commented 6 months ago

This group has proposed an interesting idea. However, there are still some flaws that need to be discussed.

Regarding the dataset:

While the data is available, there are questions about the collection criteria.
Additionally, there might be some NaN values present. Could you elaborate on the techniques your group plans to use to address these?
Furthermore, why did your group collect data spanning 26 years but only simulate changes in weather factors over a period of 10 years?

Concerning visualization:

To predict the future, your group should carefully prepare the model, considering the relatively small amount of collected data.
I think your group could include charts for comparison by regions, such as South Asia versus Southeast Asia.

tranvu21 commented 6 months ago

Thank you for your comments on the dataset and your ideas for further developing the visualization.

Regarding the dataset:

Our project aims to produce a global weather model, so a dataset covering 154 countries at the state/province level is specific enough for the input (it is currently the best source we have found). Honestly, we do not fully understand what you mean by "collection criteria". If it means criteria for evaluating the credibility of the data sources, we collected the data from Worldbank.org, which is well known for its credible datasets, so we are confident in its reliability. In fact, if you visit the link attached in the proposal, you will see that the full dataset is far larger than what we pushed to this repo; given the specific purpose and scope of the project, we use only a small part of it.
For NaN values, we agree that some provinces/states may have empty entries. We propose two alternatives for them. First, we can leave them as non-interactive grey areas on the globe, indicating that their data is not available. Second, we can average the values of a NaN area's surrounding neighbors to "simulate" its expected value, then use that to fill in the dataset.
The choice of a decade-long visualization is mostly to reduce model complexity (3D rendering takes a long time).
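
To make the second NaN alternative concrete, here is a minimal sketch of neighbor averaging. The region names, the adjacency map, and the function name `fill_from_neighbors` are all hypothetical; the real dataset would need its own province/state adjacency information, which is assumed here rather than taken from the proposal.

```python
import math

def fill_from_neighbors(values, neighbors):
    """Return a copy of `values` in which each NaN entry is replaced by the
    mean of its neighbors' known values. Entries with no known neighbor
    stay NaN. Only original values are used, so filled entries do not
    feed into other fills."""
    filled = dict(values)
    for region, value in values.items():
        if not math.isnan(value):
            continue
        known = [
            values[n]
            for n in neighbors.get(region, ())
            if n in values and not math.isnan(values[n])
        ]
        if known:
            filled[region] = sum(known) / len(known)
    return filled

# Hypothetical example data: one value per region, NaN where missing.
temps = {"A": 20.0, "B": float("nan"), "C": 24.0, "D": float("nan")}
# Hypothetical adjacency: which regions border each missing region.
adjacency = {"B": ["A", "C"], "D": ["B"]}

filled = fill_from_neighbors(temps, adjacency)
```

Regions that remain NaN after this step (such as "D" here, whose only neighbor is itself missing) could still fall back to the first alternative and be shown as grey areas.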

Concerning visualization:

I appreciate your suggestion of additional charts for specific frequently asked questions about the weather; we will consider it a "nice-to-have" feature in our project.

tienvu95 commented 6 months ago

Thanks Group I for the feedback.

I think handling NaN values is an important step. If done correctly, it helps to enhance the robustness of your predictions. Given the time available, we might not need to make predictions for all available regions, but can instead focus on those with complete data first.
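
As a rough illustration of focusing on regions with complete data, the sketch below selects only the regions whose series contain no NaN. The region names, the sample values, and the helper `complete_regions` are hypothetical and not taken from the project's dataset.

```python
import math

def complete_regions(series_by_region):
    """Return the sorted names of regions whose series is non-empty
    and contains no missing (NaN) values."""
    return sorted(
        region
        for region, series in series_by_region.items()
        if series and not any(math.isnan(v) for v in series)
    )

# Hypothetical yearly values per region; NaN marks a missing year.
records = {
    "Region X": [25.1, 25.4, 25.9],
    "Region Y": [18.0, float("nan"), 18.6],
    "Region Z": [30.2, 30.1, 30.5],
}

ready = complete_regions(records)
```

Predictions could start with the regions in `ready`, with the remaining regions added later once their gaps are handled.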

I also feel that "collection criteria" is a vague term, so @tranvu21 can contact Team I for clarification on their feedback.

Note that you don't need to reply to my comment. Good luck with your project!

