chaoss / augur

Python library and web service for Open Source Software Health and Sustainability metrics & data collection. You can find our documentation and new contributor information easily here: https://oss-augur.readthedocs.io/en/main/ and learn more about Augur at our website https://augurlabs.io
MIT License

Google Summer of Code & Outreachy Project Questions #545

Closed: sgoggins closed this issue 4 years ago

sgoggins commented 4 years ago

Google Summer of Code and Outreachy: Augur, 2020

Idea: Machine Learning for Anomaly Detection in Open Source Communities

Micro-tasks and place for questions

Augur is an open source platform that systematically integrates data from several open source repositories, issue trackers, mailing lists, and other communication systems that open source projects rely on. The result is a highly structured (relational and graph databases), consistent, and validated collection of open source health and sustainability data. Hundreds of highly specialized data requests are implemented in Augur's API, data and visualizations are pushed to Augur users, and the results of one user's request benefit the whole community.

The volume of activity across all dimensions of open source makes identifying significant changes by hand labor intensive and impractical. By connecting Augur's "insight worker" to its "push notification" architecture, and to related pages that allow exploration of identified anomalies, open source companies, community managers, and contributors will be in a better position to identify community or technology issues quickly.
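To make the idea concrete, here is a minimal sketch of one simple way to flag unusual activity: a trailing-window z-score over a series of weekly counts. It is an illustration only and not Augur's insight worker. The function name `find_anomalies`, the `window` and `threshold` parameters, and the sample data are all assumptions made for this example.

```python
from statistics import mean, stdev


def find_anomalies(counts, window=4, threshold=3.0):
    """Return indices whose value is more than `threshold` standard
    deviations away from the mean of the preceding `window` values.

    Hypothetical helper for illustration; not part of Augur.
    """
    anomalies = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history gives nothing to scale against, so skip it.
            continue
        if abs(counts[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up weekly commit counts with one obvious spike.
weekly_commits = [20, 22, 19, 21, 20, 23, 95, 21, 20]
print(find_anomalies(weekly_commits))
```

With this sample series only the spike at index 6 is reported. The weeks right after it are not flagged, because the spike inflates the trailing standard deviation; a real detector would likely need a more robust baseline.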

The aims of the project are as follows:

Idea: Implementation of GitLab Data Collection Workers

Micro-tasks and place for questions

One of Augur's greatest strengths is its highly structured and unified ecosystem data model. This data drives all of the metrics and visualizations that are provided, and is of vital importance to the people maintaining open source projects. Of course, that data has to be gathered somehow, which is where the data collection workers come in. Each worker is responsible for gathering, transforming, and storing data related to a particular project from a particular data source. Building a GitLab data collection worker will enable Augur to collect data about commits, issues, contributors, and PRs from a large number of open source projects that live on GitLab.
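The gather, transform, and store responsibilities described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not Augur's worker framework: the function names, the `issues` table, and its columns are hypothetical, the sample records are hard-coded and only loosely shaped like an issue payload, and an in-memory SQLite database stands in for Augur's real database.

```python
import sqlite3


def gather():
    """Stand-in for a call to a data source's API.

    Returns hard-coded, made-up records instead of making a network request.
    """
    return [
        {"iid": 1, "title": " Fix login redirect ", "state": "opened",
         "author": {"username": "alice"}},
        {"iid": 2, "title": "Update docs", "state": "closed",
         "author": {"username": "bob"}},
    ]


def transform(raw_records):
    """Flatten source-specific records into rows for a unified table."""
    return [
        (r["iid"], r["title"].strip(), r["state"], r["author"]["username"])
        for r in raw_records
    ]


def store(conn, rows):
    """Write rows to a hypothetical `issues` table, replacing existing ones."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS issues ("
        "issue_number INTEGER PRIMARY KEY, title TEXT, state TEXT, reporter TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO issues VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_worker(conn):
    """One collection pass: gather, then transform, then store."""
    store(conn, transform(gather()))


conn = sqlite3.connect(":memory:")
run_worker(conn)
run_worker(conn)  # A second pass replaces rows instead of duplicating them.
print(conn.execute("SELECT COUNT(*) FROM issues").fetchone()[0])
```

Keeping the three steps as separate functions means a new data source would, in this sketch, only need its own `gather` and `transform`, while `store` and the table stay the same.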

The aims of the project are as follows:

Difficulty: Medium

Idea: (Blockchain) : Open Source Health and Sustainability SSO Implementation with Hyperledger/Indy and OAUTH

Micro-tasks and place for questions

As the size and scope of projects with rich analytical data grows, the need to protect the privacy and anonymity of individuals working in open source software is a rising concern. Implementing blockchain technology for single sign-on (SSO) across different collections of data is one mechanism for enabling comparisons, analysis, and typologies for open source projects, making these growing, rich data sets more useful for developers, community managers, open source program officers, industry leaders, and other stakeholders. This project promises close collaboration with individuals in open source journalism, open data efforts, and others with an interest in protecting individual privacy rights. It's also a unique and exciting path to work with blockchain technology on a team focused on its use for SSO.

The aims of the project are as follows:

Difficulty: Medium

cbadjatya commented 4 years ago

I would like to contribute to the project "Open Source Health and Sustainability SSO Implementation with Hyperledger/Indy and OAUTH". What is the micro-task for this project?

PiyushSharma99 commented 4 years ago

I would like to contribute to the project "Open Source Health and Sustainability SSO Implementation with Hyperledger/Indy and OAUTH". Please assign me the micro-task for this project.

ankitkumarsamota121 commented 4 years ago

I am interested in contributing to the project "Machine Learning for Anomaly Detection in Open Source Communities". How do I get started with micro-tasks?

saphal1998 commented 4 years ago

I find the project 'Implementation of GitLab Data Collection Workers' interesting enough. I've had some prior experience working with data scraping, so I think this should be something I can quickly get started with. I would really appreciate it if you could point me to a beginner issue for this project. @sgoggins @germonprez

kartik1000 commented 4 years ago

Hello, I am interested in 'Implementation of GitLab Data Collection Workers'. I would really appreciate it if anybody could help me get started with this or point me to a beginner issue for the project.

chinmay81098 commented 4 years ago

I would like to contribute to project "Machine Learning for Anomaly Detection in Open Source Communities". What are the initial tasks for this project?

KritikGarg1 commented 4 years ago

I would like to contribute to the project "Open Source Health and Sustainability SSO Implementation with Hyperledger/Indy and OAUTH". I have prior experience in creating blockchain using python and deploying using flask. Please assign me the initial task. @sgoggins @germonprez

mhash1m commented 4 years ago

Hi, I'd like to contribute to the project : "Machine Learning for Anomaly Detection in Open Source Communities". Waiting on the micro-tasks and questions for the project :)

siddharthjain1611 commented 4 years ago

Hi, I'm Siddharth Jain, and I'd like to contribute to the project "Machine Learning for Anomaly Detection in Open Source Communities". I have prior experience working with machine learning in Python. Waiting on the micro-tasks and questions for the project :)

KIRA009 commented 4 years ago

Hello, I am interested in 'Implementation of GitLab Data Collection Workers'

humblefool1997 commented 4 years ago

Hi, I am interested in Implementation of GitLab Data Collection Workers. Can you suggest micro-tasks and questions for the project?

mrsaicharan1 commented 4 years ago

Hi, I'm Saicharan and I'd like to contribute to the project : "Implementation of GitLab Data Collection Workers". Would appreciate it if you could assign a micro-task to me. Thanks

germonprez commented 4 years ago

@sgoggins do you have the microtasks for these?

parthsharma2 commented 4 years ago

Really excited to see so many people interested in these GSoC Projects. For everyone interested in these projects I suggest setting up Augur locally as the first microtask. Go through the documentation to set up Augur locally and get a basic understanding of Augur's architecture, how it works, etc. There might be certain parts in the documentation that are incomplete or missing or need improvement, so feel free to ask questions here and help us improve the documentation (feel free to submit PRs :smiley: )

For people interested in implementing the "GitLab Data Collection Workers", I'd suggest that after setting up a local instance of Augur, you try running a few workers to collect data and work out how they operate. Also, take a look at Augur's unified database schema to get a sense of the data the different workers collect, and read the implementations of the workers currently available to see how they are built.

aksh555 commented 4 years ago

Hi, I'm Akshara, interested in working with machine learning and would like to contribute to the project "Machine Learning for Anomaly Detection in Open Source Communities". Awaiting the micro-tasks and questions for this project.

pratikmishra356 commented 4 years ago

Hi, I am Pratik Mishra, a machine learning enthusiast. I found the project named "Machine Learning for Anomaly Detection in Open Source Communities" interesting. Looking forward to contributing in this field.

sgoggins commented 4 years ago

Microtasks are coming! Had Norovirus and then a trip. Within 18 hours!

bnitin92 commented 4 years ago

Hi, I am Nitin Bhandari, interested in working on the project "Machine Learning for Anomaly Detection". I would like to know more about how to begin contributing to this project. Awaiting further instructions and microtasks.

mrsaicharan1 commented 4 years ago

@sgoggins @parthsharma2 @Nebrethar I'm done with my first microtask and I've created a repo for it. https://github.com/mrsaicharan1/chaoss-microtasks/.

Would appreciate it if you could assign some more tasks related to metrics and data collection :D

sgoggins commented 4 years ago

Microtask ideas for SSO now posted in link. @KritikGarg1 @PiyushSharma99 @Chinmay4400

sgoggins commented 4 years ago

Microtask ideas for Machine learning now posted in link of this issue @bnitin92 @pratikmishra356 @aksh555 @siddharthjain1611 @mHash1m @chinmay81098 @ankitkumarsamota121

sgoggins commented 4 years ago

The microtask for the GitLab worker is now posted in the link of this issue: @mrsaicharan1 @isaeef @KIRA009 @kartik1000 @saphal1998

mrsaicharan1 commented 4 years ago

> The microtask for the GitLab worker is now posted in the link of this issue: @mrsaicharan1 @isaeef @KIRA009 @kartik1000 @saphal1998

I've completed my microtasks. Additionally, I've also made a pull request for a new metric addition. I would really appreciate it if you could review it. Thanks! https://github.com/chaoss/augur/pull/556 https://github.com/chaoss/augur/pull/560

jessicadong101 commented 4 years ago

Hello! I'm Jessica Dong, and I'm interested in the project "Implementation of GitLab Data Collection Workers." Looking forward to contributing and getting to know more about the project!

dnyanai commented 4 years ago

Hi, I am interested in contributing to the project 'Machine Learning for Anomaly Detection in Open Source Communities'. Looking forward to getting to know more about the project!

vvijayalakshmi21 commented 4 years ago

Hi, I am an Outreachy applicant interested in contributing to the 'Machine Learning for Anomaly Detection in Open Source Communities' project. Kindly assign me a task to get started.

aakankshadhurandhar commented 4 years ago

Hello, I am an Outreachy applicant. I am interested in contributing to the 'Machine Learning for Anomaly Detection in Open Source Communities' project. I have prior experience working with machine learning in Python. Awaiting further microtasks and instructions @germonprez @sgoggins

Mightynasty commented 4 years ago

Hi, I got into the Outreachy program and would like to contribute to the Anomaly detection project. I'm willing to learn some more about machine learning and practice my Python skills. Looking forward to something to start with!

janvi04 commented 4 years ago

Hi! I am an Outreachy applicant. I am interested in working on 'Machine Learning for Anomaly Detection in Open Source Communities'. I have experience working with machine learning algorithms using Python and scikit-learn. Looking forward to getting started with tasks. @germonprez @sgoggins

dnabanita7 commented 4 years ago

> Microtask ideas for Machine learning now posted in link of this issue @bnitin92 @pratikmishra356 @aksh555 @siddharthjain1611 @mHash1m @chinmay81098 @ankitkumarsamota121

Hello! I am an ML/DL enthusiast. I am interested in working on the project "Anomaly detection". Can I get started with microtasks?

rishilss99 commented 4 years ago

Hey everyone, I am Rishil.

I am an undergraduate student in electrical engineering with a background in applied machine learning. I have intermediate experience using machine learning and visualization libraries in Python and hope to expand my skills through a collaborative rather than competitive approach to learning.

I am a first-time Outreachy applicant and am both intimidated and very excited to join. I hope to learn and make meaningful contributions to the project on Machine Learning for Anomaly Detection in Open Source Communities.

Thank you

germonprez commented 4 years ago

Hi everyone,

If you are interested in this project for either GSoC or Outreachy, please get started on the microtasks as mentioned here: https://github.com/chaoss/augur/issues/558

puneet29 commented 4 years ago

Hi, I'm Puneet, and I have relevant knowledge in machine learning and flask as required. I am interested in working on the project "Machine Learning for Anomaly Detection in Open Source Communities". Looking forward to contributing to this project. Thanks.

KPoornima commented 4 years ago

Hi! My name is Kadukuntla Poornima and I am a B.Tech Computer Science student at the Indian Institute of Technology, Bhubaneswar. I am an Outreachy 2020 applicant and am looking forward to contributing to the project 'Machine Learning for Anomaly Detection in Open Source Communities'. I have relevant prior experience with machine learning, deep learning, and scikit-learn and am eager to learn more and dive in!

Jigyasa-Kumari commented 4 years ago

Hi, I am Jigyasa, and I am a student at the Indian Institute of Technology, Roorkee. I am an Outreachy 2020 applicant, and I would be interested in contributing to 'Machine Learning for Anomaly Detection in Open Source Communities'. I would love to get started!

namratavalecha commented 4 years ago

Hello, I am Namrata Valecha and I'm a GSoC aspirant from Delhi, India. I am currently pursuing a B.Tech in Computer Science and Engineering at Guru Gobind Singh Indraprastha University. I have experience in Python/Django web development and have completed a couple of internships in the same. I also have a good knowledge of frontend development, SQL and NoSQL databases, the Linux environment, Git, Docker, and Flask backends with RESTful APIs, third-party authentication, and payment gateway integrations. I went through CHAOSS's project ideas and found the Implementation of GitLab Data Collection Workers project interesting. Looking forward to contributing to this project. Thanks.

abhhii commented 4 years ago

Hi, I am Abhishek, currently pursuing a B.Tech in Computer Science and Engineering at the Indian Institute of Information Technology Trichy. I have experience with backend development using Node.js and I am willing to learn development using Flask. I am also comfortable with frontend development and SQL and NoSQL databases. I have experience with machine learning using scikit-learn and deep learning using fastai and PyTorch. I am interested in the Machine Learning for Anomaly Detection in Open Source Communities project and look forward to contributing to it.

ccarterlandis commented 4 years ago

@puneet29 @KPoornima @Jigyasa-Kumari @namratavalecha @abhhii hello all and welcome to Augur! We are excited you're interested in helping us and we would love to help you make your first contribution to our codebase. Please see the relevant issue links below for detailed information and first steps for our microtasks:

Machine Learning: #558
Single-Sign-On (SSO): #557
GitLab Worker Implementation: #559

We look forward to seeing what you do!!

Aparna-Sakshi commented 4 years ago

Hi, I am Aparna Sakshi, currently pursuing Mathematics and Computing at the Indian Institute of Technology, Kharagpur. I am really interested in contributing to the project Machine Learning for Anomaly Detection.

sanusoumya17 commented 4 years ago

Hello, I am Sanu Soumya, pursuing Mathematics & Computer Science at Miranda House, DU. I am an Outreachy 2020 applicant and am looking forward to contributing to the project 'Machine Learning for Anomaly Detection in Open Source Communities'. I have relevant prior experience with machine learning and Python. Excited to learn and to contribute!

ccarterlandis commented 4 years ago

Hello everyone! I have just finished setting up our Slack workspace for applicants, and am happy to invite anyone who is applying to Augur for GSoC 2020. If you would like to be added to the channel, please send me an email at c@carterlandis.com with your name and the email address you would like me to send the invite to. If you already got an invite from me, you can use it, or if you like I can send it to a different email address. 😊

snehal199 commented 4 years ago

Hi, I am Snehal and I am an Outreachy applicant. I am looking forward to contributing to the project "Machine Learning for Anomaly Detection in Open Source Communities". I've spent some time learning more about CHAOSS and Augur. Really excited to start contributing to the project!

ccarterlandis commented 4 years ago

GSoC submissions are over and applicants have been selected. Thank you to everyone who submitted!