Closed: NING-CSU closed this issue 7 months ago

Thanks for your invaluable contribution to the open-source community through your excellent work.
As I delve into your project, particularly the pre-training phase, I've encountered a couple of aspects on which I'd like to seek clarification:
- Regarding the node_index = 15 setting in main.py: could you please provide insight into the reasoning behind choosing this specific value?
- The "stage" variable in the code can be assigned values such as source, task1, target, target_maml, task2, test, and dann, but I'm confused about the specific meaning of each. It would be immensely helpful if you could briefly explain the purpose and significance of each stage.
Looking forward to your reply!
Hi, author. I have the same confusion about why you set node_index = 15. Doesn't it need to train all the node_index values?
Also, when running CUDA_VISIBLE_DEVICES=0 python main.py --taskmode task4 --model v_GWN --test_data metr-la --ifnewname 1 --aftername TrafficData, it seems to train only the target dataset (metr-la). But in the paper I found "A collection of optimized spatio-temporal prediction models based on the dataset of source cities (Figure 1)".
Could you also clarify this? Thanks in advance!
To packer-c: Yes, it needs to train all the node_index values; the '15' was a small bug, which has been corrected. Actually, 'task4' is the instruction for the pre-training stage. At this point, 'test_data' is one of the multiple source cities, and we train each source city separately. In a subsequent code version, we will consider your suggestion and unify the training of multiple source cities into a single instruction.
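To make "train each source city separately" concrete, here is a minimal driver sketch that re-runs the pre-training command from above once per source city. The flags are taken from the command in this thread; the dataset identifiers other than metr-la are placeholders and may differ from the names the repo actually expects.

```python
# Hypothetical driver: pre-train one model per source city by re-running
# the command above once per dataset (taskmode task4 = pre-training stage).
# Dataset identifiers other than metr-la are placeholders.
import os
import subprocess

SOURCE_CITIES = ["metr-la", "pems-bay", "chengdu", "shenzhen"]

for city in SOURCE_CITIES:
    subprocess.run(
        ["python", "main.py",
         "--taskmode", "task4",
         "--model", "v_GWN",
         "--test_data", city,        # one source city per pre-training run
         "--ifnewname", "1",
         "--aftername", "TrafficData"],
        env=dict(os.environ, CUDA_VISIBLE_DEVICES="0"),  # pin to GPU 0
        check=True,                  # stop if any city's run fails
    )
```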
Thanks for your clarification!
Will you also provide updated code that combines the parameters of all the source cities into one .npy file? I think the current model2tensor.py can only generate the corresponding .npy from one source dataset, whereas the paper describes it as coming from multiple source-city datasets.
Sure. The current model2tensor.py extracts parameters from the model of one dataset; I will later modify it to support one-click extraction for multiple source cities. Thanks for your suggestion!
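In the meantime, a rough sketch of such an aggregation, assuming model2tensor.py has already produced one parameter .npy per source city. The file names and the stacking axis here are assumptions, not the repo's actual layout:

```python
# Hypothetical aggregation: stack per-city parameter tensors into one .npy.
# File names and the stacking axis are assumptions about the output layout.
import numpy as np

city_files = ["pems-bay_params.npy", "chengdu_params.npy", "shenzhen_params.npy"]

# Assumes each file holds per-node parameter vectors with the same trailing
# shape, so the arrays can be concatenated along the node axis (axis 0).
all_params = np.concatenate([np.load(f) for f in city_files], axis=0)
np.save("all_source_cities_params.npy", all_params)
print(all_params.shape)  # e.g. (total_nodes_across_cities, param_dim)
```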
To NING-CSU:
Thanks for pointing out a small bug in the code. This 15 was a special case set at random during the code-review phase, and I forgot to delete it. I've removed this line of code.
For the "stage" variable: sorry for the confusing names; most of them date from our development and iteration phase and are no longer used (like dann). In the latest version of the code, I changed them to singlePretrain, test, and target, whose meanings are as follows:
- singlePretrain: the pre-training phase; a source dataset is trained with 70% (sufficient) of the data.
- test: the test phase; test with the remaining 30% of the data.
- target: after the diffusion model generates the parameters, fine-tune on the target city with the specified number of days of data.
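As a toy illustration of the 70/30 convention above (a sketch only; the repo's actual split logic lives in its data-loading code and may differ, e.g. in whether the split is chronological):

```python
# Toy illustration of the 70/30 convention; the repo's actual split logic
# may differ (e.g. in whether it is chronological or shuffled).
import numpy as np

data = np.random.rand(10_000, 207)  # (timesteps, sensors); METR-LA has 207 sensors

split = int(0.7 * len(data))
train_data = data[:split]  # singlePretrain: 70% used to pre-train a source city
test_data = data[split:]   # test: remaining 30% used for evaluation
```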
Thanks for your clarification!
I still have a query regarding the parameter test_data in your project.
In the pre-training phase, if only the "metr-la" dataset is utilized as the source city data, should the targetDataset parameter also be set to "metr-la" when running 1Dmain.py to learn the relevant parameters of "metr-la"?
Conversely, during the fine-tuning phase, when running main.py, should test_data be set to a dataset other than "metr-la"? It seems to me that "metr-la" cannot be both the source and target city simultaneously.
However, in the README.md file, it's mentioned that the datasets used in pre-training, training the diffusion model, and fine-tuning are all "metr-la." Could you kindly provide clarification on this matter?
Looking forward to your response.
In the current version of 1Dmain.py, if you set metr-la as 'targetDataset', then the source datasets will be the remaining three: PEMS-Bay, Didi-Chengdu, and Didi-Shenzhen. You can see how the source and target indices are differentiated and extracted in datapreparing.py. That said, the current code does not support freely designating one dataset as the source; if you want to do this, you can modify the node-selection strategy in datapreparing.py.
In the fine-tuning phase, if your target dataset is metr-la, then you should just set test_data to metr-la.
In readme.md, I'm just using metr-la as an example for each scenario. Ha-ha. So, if you want to set 'metr-la' as the target city:
- In pre-training: set test_data to PEMS-Bay, Didi-Chengdu, and Didi-Shenzhen respectively, to pre-train the models of the other three source cities.
- In diffusion: set targetDataset to 'metr-la'.
- In fine-tuning: set test_data to 'metr-la'.
Since fine-tuning and pre-training share the same code framework and use the same set of parameter names, this can be confusing; I will try to make the distinction clearer in later versions of the code. The full sequence is sketched below.
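A rough sketch of that three-step recipe, assuming each step is driven by the scripts named in this thread (main.py for pre-training and fine-tuning, 1Dmain.py for the diffusion stage). The taskmode task4 flag comes from the pre-training command above; the diffusion and fine-tuning flag spellings are assumptions, so check the repo before running:

```python
# Hypothetical end-to-end driver with 'metr-la' as the target city.
# Script names (main.py, 1Dmain.py) and taskmode task4 come from this thread;
# the diffusion and fine-tuning flag spellings below are assumptions.
import subprocess

# 1) Pre-train a model for each of the three source cities (task4).
for city in ["pems-bay", "chengdu", "shenzhen"]:  # placeholder identifiers
    subprocess.run(["python", "main.py", "--taskmode", "task4",
                    "--model", "v_GWN", "--test_data", city], check=True)

# 2) Train the diffusion model with metr-la held out as the target.
subprocess.run(["python", "1Dmain.py", "--targetDataset", "metr-la"], check=True)

# 3) Fine-tune the generated parameters on the target city.
subprocess.run(["python", "main.py", "--test_data", "metr-la"], check=True)
```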
Thanks for your patient clarification! Looking forward to the next version of the code.
Hi, author. Upon reviewing datapreparing.py, I noticed that trainid and genid are extracted separately from the saved parameter file (.npy). Does this imply that, for a specific task such as traffic speed prediction, main.py in the pre-training phase should be run not only for the source cities (PEMS-Bay, Didi-Chengdu, and Didi-Shenzhen) but also for the target city (metr-la)? Otherwise, the parameter file would not contain parameters for both the source and target cities simultaneously.
I suggest that you dive one step deeper into the code. The genid, and the corresponding genTarget, are only there for checking the error, i.e. the difference between the final generated parameters and the real ones. They are not necessary, and you can remove them when solving a real transfer task; they also do not provide any supervision or guidance over the training process.
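In other words, the check that genid/genTarget enable is conceptually just a reconstruction-error measurement between generated and real parameters. A minimal sketch of that idea (the file and variable names here are illustrative, not the repo's):

```python
# Illustrative use of the genid/genTarget idea: measure how far the
# diffusion-generated parameters are from the real pre-trained ones.
# File and variable names here are hypothetical, not from the repo.
import numpy as np

real_params = np.load("real_target_params.npy")     # pre-trained parameters
generated_params = np.load("generated_params.npy")  # diffusion output

mae = np.mean(np.abs(generated_params - real_params))
rmse = np.sqrt(np.mean((generated_params - real_params) ** 2))
print(f"MAE: {mae:.6f}  RMSE: {rmse:.6f}")
```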