bdqnghi / infercode

[ICSE 2021] - InferCode: Self-Supervised Learning of Code Representations by Predicting Subtrees

Code error for subtrees bucket selection while training #9

Closed little-pikachu closed 3 years ago

little-pikachu commented 3 years ago

For the old version, I think there is an error at line 242 of infercode/old_version/utils/data/tree_loader.py. During training, the subtrees bucket should be all_subtrees_bucket rather than random_subtrees_bucket. As written, buckets starts as random_subtrees_bucket and the training branch never reassigns it:

def make_minibatch_iterator(self):
        buckets = self.random_subtrees_bucket # line 242
        # This part is important
        if not self.is_training:
            print("Using random subtrees buckets...........")
            buckets = self.random_subtrees_bucket
        else:
            print("Using all subtrees buckets...........")
        bucket_ids = list(buckets.keys())
        random.shuffle(bucket_ids)
        ......
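
A minimal sketch of the selection logic proposed above, assuming the intended behaviour is all_subtrees_bucket for training and random_subtrees_bucket otherwise. The class name `TreeLoaderSketch`, its constructor, and the method name `select_buckets` are hypothetical stand-ins for illustration; only the two bucket attributes and `is_training` come from the quoted code.

```python
import random


class TreeLoaderSketch:
    """Hypothetical stand-in for the loader; only bucket selection is shown."""

    def __init__(self, all_subtrees_bucket, random_subtrees_bucket, is_training):
        self.all_subtrees_bucket = all_subtrees_bucket
        self.random_subtrees_bucket = random_subtrees_bucket
        self.is_training = is_training

    def select_buckets(self):
        # Assumption: training should see every subtree, so start from the full bucket.
        buckets = self.all_subtrees_bucket
        if not self.is_training:
            # Outside training, fall back to the randomly sampled subtrees.
            buckets = self.random_subtrees_bucket
        bucket_ids = list(buckets.keys())
        random.shuffle(bucket_ids)
        return buckets, bucket_ids
```

With this ordering, the default assignment matches the "Using all subtrees buckets" message printed in the training branch.
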
little-pikachu commented 3 years ago

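Another error is at line 181 of the same file, infercode/old_version/utils/data/tree_loader.py: np.reshape receives the string literal "batch_subtree_id" where, presumably, the batch_subtree_id variable was intended.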

        batch_obj = {
            "batch_node_indexes": batch_node_indexes,
            "batch_node_types": np.asarray(batch_node_types),
            "batch_node_tokens": np.asarray(batch_node_tokens),
            "batch_node_tokens_text": batch_node_tokens_text,
            "batch_children_indices": np.asarray(batch_children_indices),
            "batch_children_node_types": np.asarray(batch_children_node_types),
            "batch_children_node_tokens": np.asarray(batch_children_node_tokens),
            "batch_token_ids": batch_token_ids,
            "batch_tree_size": batch_tree_size,
            "batch_file_path": batch_file_path,
            "batch_subtree_id": np.reshape("batch_subtree_id", (self.batch_size, 1)) # line 182
        }
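
A small sketch of the reshape call with the variable passed instead of the string literal. The values of `batch_size` and `batch_subtree_id` here are made up for illustration, and this assumes batch_subtree_id holds one subtree id per sample in the batch.

```python
import numpy as np

batch_size = 3                 # hypothetical batch size
batch_subtree_id = [7, 2, 5]   # hypothetical subtree ids, one per sample

# Pass the variable itself, not the string "batch_subtree_id".
reshaped = np.reshape(batch_subtree_id, (batch_size, 1))
print(reshaped.shape)  # (3, 1)
```

Passing the string literal instead makes NumPy treat it as a single scalar element, which cannot be reshaped to (batch_size, 1) when batch_size is greater than 1.
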
bdqnghi commented 3 years ago

I suggest you use the new version; see the latest update in the README file.