HKUDS / UrbanGPT

[KDD'2024] "UrbanGPT: Spatio-Temporal Large Language Models"
https://urban-gpt.github.io
Apache License 2.0

Questions about original dataset #15

Closed zxc2012 closed 2 months ago

zxc2012 commented 2 months ago

Hello. Could you please give a description of each file in the original ST dataset and how to preprocess it? [screenshot attached]

LZH-YS1998 commented 2 months ago

Hello, sure! You can find the processing of the `poi` and `rz_map` files in `instruction_generate.py`.

Additionally, the handling of the ST dataset is outlined in `load_dataset.py`.

For example, the NYC_taxi dataset is structured as 263 x 105216 x 2, representing the number of regions, the number of timesteps (2016 to 2021, with each day divided into 48 timesteps), and the number of features (inflow and outflow), respectively. You can load it as follows:

```python
import os
import numpy as np

data_taxi_path = os.path.join('st_data/all_nyc_taxi_263x105216x2.npz')
data_taxi = np.load(data_taxi_path)['data']  # shape: (263, 105216, 2)
```
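As a minimal sketch of how such a (regions x timesteps x features) array can be sliced, here is a self-contained example using a small synthetic stand-in array (the real `st_data` file is not assumed to be present; the shapes here are hypothetical miniatures of the 263 x 105216 x 2 layout):

```python
import numpy as np

# Synthetic stand-in: 4 regions, 96 timesteps (2 days x 48 slots/day),
# 2 features (inflow, outflow) -- same layout as the real dataset.
n_regions, n_timesteps, n_features = 4, 96, 2
data = np.arange(n_regions * n_timesteps * n_features, dtype=float).reshape(
    n_regions, n_timesteps, n_features
)

# Split the last axis into the two traffic features.
inflow = data[..., 0]   # shape: (regions, timesteps)
outflow = data[..., 1]  # shape: (regions, timesteps)

# Group the time axis into days of 48 half-hour slots each.
n_days = n_timesteps // 48
daily = data.reshape(n_regions, n_days, 48, n_features)

print(inflow.shape)  # (4, 96)
print(daily.shape)   # (4, 2, 48, 2)
```

The same indexing applies unchanged to the full array after `np.load`; only the leading dimensions differ (263 regions, 105216 timesteps, 2192 days).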

zxc2012 commented 2 months ago

Thanks for that!