Jur1cek / gcj-dataset

Collected solutions from Google Code Jam programming competition (2008-2020).

How do I get to know which of the solutions are accepted (correct)? #3

Open Akash-Sharma-1 opened 3 years ago

Akash-Sharma-1 commented 3 years ago

Does the dataset contain only fully accepted submissions? If not, is there a way to check which submissions were fully accepted?

Jur1cek commented 3 years ago

Unfortunately, the dataset does not contain this information (I may add it in the future).

You can check manually whether a submission was accepted using the links below:

https://www.go-hero.net/jam/17 - up to and including 2017.

https://codingcompetitions.withgoogle.com/codejam/archive - after 2017.

skdebray commented 6 months ago

I am trying to use the GCJ datasets to do a replication study on the following paper on binary-level code stylometry:

Aylin Caliskan, Fabian Yamaguchi, Edwin Dauber, Richard Harang, Konrad Rieck, Rachel Greenstadt, and Arvind Narayanan. 2018. When Coding Style Survives Compilation: De-anonymizing Programmers from Executable Binaries. In Proceedings 2018 Network and Distributed System Security Symposium (NDSS 2018).

Like this paper, we limit ourselves to C++ submissions. To do the analysis, we need to compile the submissions to executables. Unfortunately, many of the submissions need additional compiler flags to compile correctly (we get compiler errors when we simply invoke g++ on the source files). Do you have versions of the code that include makefiles, or any other information that would help us compile them?

Many thanks, Saumya Debray
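
For anyone hitting the same compile errors, one possible way to probe which flags a given submission needs is sketched below. This is a minimal sketch under stated assumptions, not something provided by the dataset or its maintainer: the candidate flag list, the function names `build_command` and `first_working_flags`, and the choice of a syntax-only check are all hypothetical. It tries each flag set in turn with `g++ -fsyntax-only` and returns the first one the compiler accepts.

```python
import subprocess

# Hypothetical candidate flag sets, ordered from least to most permissive.
# These are guesses, not flags recorded anywhere in the dataset.
CANDIDATE_FLAGS = [
    [],
    ["-std=c++11"],
    ["-std=c++14"],
    ["-std=c++17"],
    ["-std=gnu++17", "-fpermissive"],
]


def build_command(source, flags, compiler="g++"):
    """Return the argv list for a syntax-only compile of `source`."""
    return [compiler, "-fsyntax-only", *flags, source]


def first_working_flags(source, candidates=CANDIDATE_FLAGS,
                        compiler="g++", run=subprocess.run):
    """Return the first flag set the compiler accepts, or None if all fail.

    `run` is injectable so the logic can be exercised without a compiler.
    """
    for flags in candidates:
        result = run(build_command(source, flags, compiler),
                     capture_output=True, text=True)
        if result.returncode == 0:
            return list(flags)
    return None


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        print(path, first_working_flags(path))
```

A syntax-only pass does not catch link-time failures, so a flag set found this way would still need a full build to confirm that it produces an executable.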