rpytel1 / log-strategy

This project was conducted for the Seminar in Machine Learning for Software Engineering. The aim of our research was to explore possible directions for deep learning solutions to log detection in a snippet of code.

Labeled code vectors from pre-trained code2vec (not an issue, just link sharing) #13

Open CasperSchroder opened 4 years ago

CasperSchroder commented 4 years ago

G drive link:

https://drive.google.com/open?id=1G0JtgelCNjjIHiGolUpF-4DbIrGyUdxO

jan-gerling commented 4 years ago

Latest version: https://drive.google.com/file/d/1ah6_wNtE_7yFHtFfB-WZSos3KsZDgIEr/view?usp=sharing

Dekelv commented 4 years ago

Small preprocessed dataset: https://drive.google.com/drive/folders/19YT1Od8ME-q-48kfd-79LOIJHcj1ZM_b?usp=sharing

jan-gerling commented 4 years ago

220,000 code vectors with names and labels, rebalanced to 20% positive and shuffled: https://drive.google.com/file/d/1xd24wG7lpcw1SF8SwZAtshEl1W6jZsu8/view?usp=sharing
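For anyone who wants to produce a similar split themselves, here is a minimal sketch of one way to rebalance to 20% positive and shuffle. It is not necessarily how the file above was generated. The `rebalance` function and the `(vector, name, label)` tuple layout are assumptions made for illustration, with label 1 meaning the method contains a log statement.

```python
import random


def rebalance(samples, positive_ratio=0.2, seed=0):
    """Downsample negatives so positives make up `positive_ratio`, then shuffle.

    `samples` is a list of (vector, name, label) tuples (hypothetical layout),
    where label 1 is positive and label 0 is negative.
    """
    rng = random.Random(seed)
    positives = [s for s in samples if s[2] == 1]
    negatives = [s for s in samples if s[2] == 0]

    # Number of negatives needed so that positives / total == positive_ratio.
    wanted = round(len(positives) * (1 - positive_ratio) / positive_ratio)
    # Never ask for more negatives than are available.
    wanted = min(wanted, len(negatives))

    balanced = positives + rng.sample(negatives, wanted)
    rng.shuffle(balanced)
    return balanced
```

This keeps every positive sample and only drops negatives, so the final size is bounded by the number of positives. A fixed seed makes the split reproducible.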

Dekelv commented 4 years ago

Complete preprocessed data; the training set has 24k positive and 96k negative samples: https://drive.google.com/drive/folders/180Lqpmp4X4YPwbAf_94uBiuj0B8pSM8D?usp=sharing

jan-gerling commented 4 years ago

Test set with 104K methods (3K positive), shuffled, from Elasticsearch: https://drive.google.com/file/d/1u1thXBhpmdaANGtSp4daSeEpe-tfG8S2/view?usp=sharing

jan-gerling commented 4 years ago

Test data set with 20K methods from Apache projects: https://drive.google.com/file/d/1NTLQyrDPUaivfvzCLWmhloyT_IQxmUp1/view?usp=sharing

jan-gerling commented 4 years ago

codevectors_labeled_shuffled_test02 :

codevectors_labeled_shuffled_test :

codevectors_labeled_rebalanced-0-2_shuffled :