HKUDS / UrbanGPT

[KDD'2024] "UrbanGPT: Spatio-Temporal Large Language Models"
https://urban-gpt.github.io
Apache License 2.0
243 stars 31 forks

Clarification Needed: Is the Bike Flow Prediction Example Truly Zero-Shot? #24

Open Kaleemullahqasim opened 5 days ago

Kaleemullahqasim commented 5 days ago

I came across the Example-1: Bike Flow Prediction (Zero-shot scenario) in your paper, and I have some concerns regarding the classification of this task as “zero-shot.”

As I understand it, a zero-shot scenario typically involves the model performing a task without being provided any specific prior examples or historical data that directly relate to the task. However, in the provided bike flow prediction example, the model is given 12 time steps of historical inflow and outflow data. This seems to provide the model with concrete examples to base its predictions on, which would generally classify the task as a few-shot or data-driven prediction rather than a zero-shot task.

Could you clarify why this is being labeled as a zero-shot scenario? If it is indeed zero-shot, could you explain the reasoning behind the classification, given that historical data is provided for the prediction task?

Thank you for your time and help in clearing this up!

Kaleemullahqasim commented 5 days ago

https://huggingface.co/datasets/bjdwh/ST_data_urbangpt/viewer?row=0

Even the training dataset is not zero-shot: you added few-shot examples, yet for some reason you always describe them as zero-shot.

LZH-YS1998 commented 4 days ago

Hello. We understand what may be causing the confusion. A spatio-temporal prediction task forecasts future values from historical data, whether it is run in a full-shot, few-shot, or zero-shot setting. In this context, we distinguish the zero-shot setting from the others by whether the model has been trained on historical data from the test regions. If the model has not been trained on data from the test regions, the setting is considered zero-shot. We have formalized this task in Section 2, titled "Spatio-Temporal Zero-Shot Learning."
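
To make the distinction concrete, here is a minimal sketch of a region-held-out split. It is not the UrbanGPT data pipeline: the function names `split_regions` and `make_windows`, the synthetic `flows` array, the 25% test fraction, and the 12-step forecast horizon are all assumptions chosen for illustration. Only the 12-step history length comes from the example discussed in this thread. The point it illustrates is that test samples still carry a history window as input, but no sample from a test region is ever used for training.

```python
import numpy as np

def split_regions(num_regions, test_fraction=0.25, seed=0):
    """Partition region indices into disjoint train and test sets (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_regions)
    n_test = max(1, int(round(num_regions * test_fraction)))
    return np.sort(perm[n_test:]), np.sort(perm[:n_test])

def make_windows(series, history=12, horizon=12):
    """Slice one region's (T, C) series into (history, horizon) pairs.

    history=12 mirrors the example in this thread; horizon=12 is an assumption.
    """
    n = series.shape[0] - history - horizon + 1
    if n <= 0:
        raise ValueError("series too short for the requested window sizes")
    X = np.stack([series[i:i + history] for i in range(n)])
    Y = np.stack([series[i + history:i + history + horizon] for i in range(n)])
    return X, Y

# Synthetic stand-in data: (regions, time steps, [inflow, outflow]).
flows = np.random.default_rng(0).poisson(5, size=(8, 48, 2))

train_regions, test_regions = split_regions(flows.shape[0])

# Training samples are built only from training regions.
train_samples = [make_windows(flows[r]) for r in train_regions]

# Test samples come from regions never seen in training. Each input X still
# contains 12 historical steps of that region, but those steps are model
# input at inference time, not training data.
test_samples = [make_windows(flows[r]) for r in test_regions]
X_test, Y_test = test_samples[0]
```

Under this reading, "zero-shot" describes the empty overlap between `train_regions` and `test_regions`, not the absence of a history window in the input.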

Kaleemullahqasim commented 3 days ago

Thank you for clarifying the concept of “Spatio-Temporal Zero-Shot Learning” in your study. I appreciate your explanation of how the zero-shot setting is defined based on the absence of training on the specific test regions.

However, after reviewing the approach, I have a follow-up question. It appears that the model still relies on historical data from the test region for making predictions, which can be viewed as a form of few-shot learning. If the goal is to generalize to new regions without explicit training, could this not be achieved by carefully crafting few-shot prompts that provide the model with the necessary historical data and context during inference?

Why fine-tune or train the model if a few-shot prompt can solve the whole problem, given that you are still using such a prompt at the testing (prediction) stage?