bm2-lab / STAMP

GNU General Public License v3.0

Question about subtask-decomposition training #1

Open Leafaeolian opened 5 days ago

Leafaeolian commented 5 days ago

Thanks for your excellent work! This work is something of a milestone in perturbation prediction.

The question that confuses me concerns the code for subtask-decomposition training. It seems that subtask-1 is trained separately, while subtask-2 and subtask-3 are trained jointly in a multi-task fashion. I'm curious why subtask-2 and -3 aren't similarly separated, or why all three aren't linked together as one complete multi-task strategy. Is there evidence that this setting is better?

GaoYiChengTJ commented 4 days ago

Thanks for your interest. We have performed ablation studies to illustrate this point in our supplementary information. (https://www.nature.com/articles/s43588-024-00698-1)

Leafaeolian commented 4 days ago

Sorry, I can only find Supplementary Figures 5 and 6 as ablation studies, and those focus on ablating individual subtasks. Could you point me to the exact location in the supplementary information?

GaoYiChengTJ commented 4 days ago

Subtask-2 can be considered a coarse-grained version of subtask-3, as it only focuses on the direction of the DEGs. Subtask-1 is the most important part of STAMP; however, joint training would introduce large noise into the following subtasks at the initial training stage. In our preliminary testing, this led to lower performance of STAMP.
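A toy sketch of this staged schedule (purely illustrative scalar "models", not STAMP's actual code or losses): subtask-1 is optimized to convergence on its own, and subtasks 2 and 3 then share a single joint update, so subtask-1's noisy early-stage outputs never enter their gradients.

```python
# Toy two-stage training schedule (illustrative only; not STAMP's real code).
def sgd(w, grad, lr=0.1):
    return w - lr * grad

# Stage 1: fit the subtask-1 parameter on its own loss (w1 - t1)^2.
w1, t1 = 0.0, 1.0
for _ in range(100):
    w1 = sgd(w1, 2 * (w1 - t1))

# Stage 2: subtasks 2 and 3 are trained jointly (summed losses), and only
# after stage 1 has finished, so stage-1 noise never leaks into them.
w23, t2, t3 = 2.0, 0.5, -0.5
for _ in range(100):
    w23 = sgd(w23, 2 * (w23 - t2) + 2 * (w23 - t3))
```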

Leafaeolian commented 4 days ago

Wow, great! Many thanks, this really helps me understand the paper!

Leafaeolian commented 3 days ago

> Subtask-2 can be considered a coarse-grained version of subtask-3, as it only focuses on the direction of the DEGs. Subtask-1 is the most important part of STAMP; however, joint training would introduce large noise into the following subtasks at the initial training stage. In our preliminary testing, this led to lower performance of STAMP.

Hi author, I've found a new confusing issue. My first impression was that the separation of subtask-1 from subtask-2+subtask-3 occurs only in the training stage. But in the prediction stage, it also seems to take the true labels of subtask-1, instead of using the predicted labels, for subtask-2+subtask-3. In a real-world setting, there are no true labels telling us which genes are differentially expressed. Did I misunderstand? (Code is attached below.)

```python
class STAMP:
    ...
    def prediction(self, test_file_path, combo_test=False):
        ...
        output_1 = self.best_model_firstlevel.eval()(batch_x_test[0][1].to(self.device))
        output1.append(output_1.cpu())
        labels1.append(batch_x_test[1][0].squeeze(1).float())
        # Note: the TRUE subtask-1 labels (batch_x_test[1][0]) are fed to the
        # second-level layer here, not the prediction output_1.
        output_2, mask, hids = self.best_model_secondthirdlevel.second_level_layer.eval()(
            batch_x_test[1][0].squeeze(1).float().to(self.device),
            batch_x_test[0][1].to(self.device),
        )
        output2.append(output_2.squeeze(-1).cpu())
        labels2.append(batch_x_test[1][1].squeeze(1).float())
        output_3, mask = self.best_model_secondthirdlevel.third_level_layer.eval()(hids, mask)
        output3.append(output_3.squeeze(-1).cpu())
        labels3.append(batch_x_test[1][2].squeeze(1).float())
```

GaoYiChengTJ commented 2 days ago

Yes, we need to use the true labels of subtask-1 when evaluating performance on subtask-2+subtask-3, since we focus on the performance on DEGs. For evaluating the accuracy of identifying DEGs themselves, we compare the predicted scores of subtask-1 against the true labels of subtask-1. In real application cases, you can directly use the model's output, as no benchmarking is involved there.
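In a real application with no ground-truth DEG labels, one way to do this (a hypothetical helper, not part of STAMP's codebase) is to binarize subtask-1's predicted scores and feed the resulting mask to the second-level layer in place of the true labels `batch_x_test[1][0]`:

```python
# Hypothetical helper: turn subtask-1 scores into a 0/1 DEG mask
# for subtasks 2 and 3 when no true labels exist. The threshold of
# 0.5 is an assumption, not a value taken from the paper.
def predicted_deg_mask(scores, threshold=0.5):
    return [1.0 if s >= threshold else 0.0 for s in scores]
```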

Leafaeolian commented 2 days ago

Thanks for your reply. But I'm still confused: if subtask-1 is fully separated from subtask-2 and -3 in both training and testing, how does subtask-1 benefit subtask-2 and -3?

GaoYiChengTJ commented 2 days ago

We used the DEGs identified by statistical methods to constrain the model's learning for subtask-2 and -3, which improves the signal-to-noise ratio to a certain extent. Intuitively, we do not want the model to focus on non-DEGs, as that can be considered fitting noise.
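A minimal sketch of this constraint (illustrative only, not STAMP's actual loss): the regression loss for subtasks 2 and 3 is restricted to the positions the statistical test flagged as DEGs, so non-DEG positions contribute no gradient.

```python
# Sketch of a DEG-masked loss (assumed formulation, not STAMP's code):
# only positions with deg_mask == 1 contribute to the error, so the
# model is never pushed to fit noise at non-DEG positions.
def deg_masked_mse(pred, target, deg_mask):
    terms = [(p - t) ** 2 for p, t, m in zip(pred, target, deg_mask) if m]
    return sum(terms) / len(terms)
```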

Leafaeolian commented 2 days ago

Many thanks!