saltudelft / type4py

Type4Py: Deep Similarity Learning-Based Type Inference for Python
Apache License 2.0
61 stars 13 forks

Type4py for Large Datasets #14

Open LangFeng0912 opened 1 year ago

LangFeng0912 commented 1 year ago

Overall: Scripts for processing large datasets have been added. The additions and updates are:

- In "main.py": add new CLI commands "learn_split", "gen_cluster", and "infer_projects".
- In "vectorize.py": update the datapoint generation function to work in batches.
- Add "learn_split.py" for training the model separately.
- Add "gen_cluster.py" for generating clusters based on the model separately.
- Add new dataset-loading functions in "data_loaders.py".
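
To illustrate the batching idea mentioned for "vectorize.py", here is a minimal sketch of processing datapoints in fixed-size chunks so that a large dataset never has to be held in memory at once. The names `iter_batches`, `vectorize_in_batches`, `vectorize_fn`, and `batch_size` are hypothetical and chosen only for this example; they are not taken from the Type4Py code, and the actual implementation may differ.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
V = TypeVar("V")


def iter_batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items (hypothetical helper)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(items)
    while True:
        # islice pulls only the next `batch_size` items from the iterator
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def vectorize_in_batches(
    rows: Iterable[T],
    vectorize_fn: Callable[[T], V],
    batch_size: int = 1000,
) -> Iterator[List[V]]:
    """Apply `vectorize_fn` to each row, one batch at a time.

    `vectorize_fn` stands in for whatever turns a single datapoint into a
    vector; it is an assumption of this sketch.
    """
    for batch in iter_batches(rows, batch_size):
        yield [vectorize_fn(row) for row in batch]
```

Each yielded batch could then be written to disk or passed to the model before the next one is built, which is what keeps peak memory bounded by `batch_size` rather than by the dataset size.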