smartshark / smartshark.github.io

Project Homepage
https://smartshark.github.io
Apache License 2.0

Question about the pycoSHARK API #23

Closed: shehan closed this issue 3 years ago

shehan commented 3 years ago

Hi,

I hope this is the correct place to ask my question. The SmartSHARK dataset looks really interesting, and I'd like to use it for the MSR 2022 Mining Challenge. I followed the instructions in your usage-examples repository, successfully restored the SmartSHARK database, and executed the Jupyter notebook. My question is about using the pycoSHARK API.

  1. I did not find the existing pycoSHARK API documentation helpful in determining how to use the various models. Do you have any other artifacts I could refer to?
  2. Some specific questions I have:
     a. How do I get a list of all projects? The usage example only shows how to get a specific project.
     b. Is there a way to get all instances of a specific model (such as Refactoring or Issue) without first retrieving the project? For example, I first want to get all move method refactorings and then obtain the projects that are associated with these refactorings.

I hope these questions make sense :)

Thanks again for making the dataset available to the community!

atrautsch commented 3 years ago

Hi,

thank you for your interest in our work!

The API documentation is linked in the pycoSHARK README. Other than that, you can look at mongomodels.py.

Aside from utility functions, pycoSHARK just defines which fields are in which document of which collection, to provide a common database schema. pycoSHARK and most of our code use Mongoengine, so the Mongoengine documentation will probably help you more with common use cases such as listing all documents in a collection or constructing queries.

To fetch all projects: Project.objects.all()
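As a minimal sketch of that call in context: the helper below takes the model class as a parameter, so it does not depend on a live connection. The function names, the database name "smartshark", and the host and port are assumptions for illustration; adjust them to however you restored the dump.

```python
def list_project_names(project_model):
    """Return the names of all documents in the project collection, sorted."""
    # .objects.all() is Mongoengine's QuerySet over the whole collection.
    return sorted(project.name for project in project_model.objects.all())


def connect_and_list():
    """Hypothetical entry point; the connection settings are assumptions."""
    # Imported lazily so the helper above can be used without a database.
    from mongoengine import connect
    from pycoshark.mongomodels import Project

    connect("smartshark", host="localhost", port=27017)
    return list_project_names(Project)
```

If your MongoDB instance requires credentials, they would have to be passed to connect() as well.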

You can use the same method to get all documents from the other collections, e.g., Issue.objects.all(). However, the returned data gets very large very quickly unless you at least restrict the query to a project.
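The sketch below illustrates both directions: restricting issues to one project, and starting from refactorings of one type to find the projects they belong to. It is not taken from the pycoSHARK documentation. The helper names are hypothetical, and the field names (project_id, issue_system_id, type, commit_id, vcs_system_id) are assumptions about the schema that should be checked against mongomodels.py.

```python
def issues_for_project(project, issue_system_model, issue_model):
    """Yield the issues of one project, issue tracker by issue tracker."""
    for tracker in issue_system_model.objects(project_id=project.id):
        # Filtering by tracker keeps each query limited to one project's data.
        for issue in issue_model.objects(issue_system_id=tracker.id):
            yield issue


def project_ids_for_refactoring_type(ref_type, refactoring_model,
                                     commit_model, vcs_system_model):
    """Return the ids of projects that contain a refactoring of ref_type.

    Assumed links: Refactoring.commit_id -> Commit.vcs_system_id
    -> VCSSystem.project_id.
    """
    seen_commits = set()
    vcs_ids = set()
    for ref in refactoring_model.objects(type=ref_type).only("commit_id"):
        if ref.commit_id in seen_commits:
            continue  # look up each commit only once
        seen_commits.add(ref.commit_id)
        commit = (commit_model.objects(id=ref.commit_id)
                  .only("vcs_system_id").first())
        if commit is not None:
            vcs_ids.add(commit.vcs_system_id)
    vcs_systems = vcs_system_model.objects(id__in=list(vcs_ids))
    return {vcs.project_id for vcs in vcs_systems}
```

With a connection open, the first helper would be called with a Project document plus the IssueSystem and Issue classes, and the second with a type string plus the Refactoring, Commit, and VCSSystem classes from pycoshark.mongomodels. The exact type string for move method refactorings is not given in this thread, so inspect a few Refactoring documents to find the value stored in the dataset.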

Hope this helps!

Have fun with the data!

shehan commented 3 years ago

Thank you for the reply. This is my first time using MongoDB, so there is a bit of a learning curve for me :) I've been playing around with the Mongoengine API in Python and I'm figuring out how to get things done!

Thanks again!