ananab / bduck

Suggested questions #2

Open peter0083 opened 7 years ago

peter0083 commented 7 years ago

Black Duck Data Challenge

University of British Columbia

March 9th 2017

Some helpful questions

Reference: https://en.wikipedia.org/wiki/Six_degrees_of_separation

peter0083 commented 7 years ago

I have received several questions about the dataset, prizes, and submission method, and I want to share the answers with everyone. Here are some of the questions and my answers:

  1. What does each column in the data set mean?

    d_r_uuid is the open source project ID.

    dws, dns, and so objects are three features of the project (you can think of them as different features or representations of the open source project).

    version is the security version used in the particular project. This matters because you need to know which version a project uses in order to check whether that version is affected by certain security vulnerabilities.

    license_id is the ID of the open source license the project complies with. Hint: the fewer licenses a project uses, the better, since there is a lower chance of violating a license.

  2. When is the deadline and how to submit?

    The deadline is noon on Monday, March 20th. Final submissions can be sent to me by email, with any sort of attachment you used for your work. If you coded your analysis, we expect your code along with your five insights.

  3. What’s the prize for submission and for winning?

    Each team that submits work with at least five insights about the data will receive a $50 gift card. Winning teams will receive interview offers, a free lunch at our office with other data scientists, and a free office tour.

  4. Any general tips on the data challenge?

    The key is finding interesting patterns and ways to visualize them. I wouldn’t say one algorithm is better than another, since there is no right or wrong answer. I suggest that you keep asking questions such as “Are there any better methods?”, “Is this pattern intriguing?” or “How can I do better?”
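
As one possible starting point for the license hint in answer 1, here is a minimal sketch that counts the distinct license_id values per d_r_uuid. The sample rows and the helper name licenses_per_project are made up for illustration; the real dataset's layout and values may differ, and this is not an official or expected approach.

```python
from collections import defaultdict

# Hypothetical sample rows: (d_r_uuid, version, license_id).
# These values are invented for illustration, not taken from the dataset.
rows = [
    ("proj-a", "1.0", "lic-1"),
    ("proj-a", "1.0", "lic-2"),
    ("proj-a", "1.1", "lic-1"),
    ("proj-b", "2.3", "lic-1"),
]


def licenses_per_project(rows):
    """Map each project ID to the number of distinct license IDs it uses."""
    seen = defaultdict(set)
    for project_id, _version, license_id in rows:
        seen[project_id].add(license_id)
    return {project_id: len(ids) for project_id, ids in seen.items()}


counts = licenses_per_project(rows)

# List projects with the fewest licenses first.
for project_id, n in sorted(counts.items(), key=lambda kv: kv[1]):
    print(project_id, n)
```

Sorting by the count puts the projects with the fewest licenses at the top, which is one simple way to turn the hint into a question you can visualize.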