@JoaoFelipe, thank you for applying to the internship project program.
We're pleased to inform you that your application is complete and in excellent condition. Our team will review it and get back to you shortly with further details.
We appreciate your interest in joining our program. All the best.
noworkflow is already listed:
thanks @JoaoFelipe
Organization/Project Name
noWorkflow
Organization/Project Webpage URL
https://gems-uff.github.io/noworkflow/
Organization/Project Introduction
The noWorkflow project aims to let scientists benefit from provenance data analysis even when they don't use a workflow system. It transparently collects provenance from Python scripts and notebooks and provides tools to support the analysis and management of that provenance.
Organization/Project Summary
noWorkflow is written in Python and currently captures the provenance of Python scripts using Software Engineering techniques such as abstract syntax tree (AST) analysis, reflection, and profiling, without requiring a version control system or any other environment. Capturing the provenance of a script is as simple as running
$ now run script.py
instead of
$ python script.py
After capturing the provenance, users can visualize how the script executed, making sense of the sequence of activations (function calls), the dependencies among variables, the libraries used, and the intermediate files generated during the execution. Users can also restore previous versions of the script and input data and track the execution history. Hence, noWorkflow also aims to free users from relying on naming conventions to keep files produced by previous executions.
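To give a feel for how profiling can record activations, here is a minimal sketch that uses only the standard library hook sys.setprofile. It is not noWorkflow's actual implementation: the helper trace_activations and the demo functions square and sum_of_squares are hypothetical names introduced for illustration, and the sketch records only the call depth and function name, not variables, files, or libraries.

```python
import sys


def trace_activations(func, *args, **kwargs):
    """Run func and record (depth, function name) for each Python-level call.

    Illustrative only: a hypothetical helper, not part of noWorkflow.
    """
    activations = []
    depth = 0

    def profiler(frame, event, arg):
        nonlocal depth
        if event == "call":
            # A Python function was entered: record it at the current depth.
            activations.append((depth, frame.f_code.co_name))
            depth += 1
        elif event == "return":
            # The function is leaving (normally or through an exception).
            depth -= 1
        # C-level events ("c_call", "c_return", "c_exception") are ignored.

    sys.setprofile(profiler)
    try:
        result = func(*args, **kwargs)
    finally:
        # Always remove the hook so later code is not profiled.
        sys.setprofile(None)
    return result, activations


# Hypothetical script functions used to demonstrate the trace.
def square(x):
    return x * x


def sum_of_squares(values):
    total = 0
    for v in values:
        total += square(v)
    return total


result, activations = trace_activations(sum_of_squares, [1, 2, 3])
for depth, name in activations:
    # Indent each activation by its nesting depth.
    print("  " * depth + name)
```

The indented listing mirrors the nesting of activations: sum_of_squares at the top level, with one nested square activation per input value.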
Organization/Project Structure
Target audience
noWorkflow mainly targets scientists who carry out their data analysis in Python. However, it also has applications in general software development. For instance, Linhares et al. (2019) proposed a tool that consumes noWorkflow provenance to support the debugging of Python scripts.
Project internal structure
The main Python package is in the capture folder. It has the installation script (setup.py) and the noworkflow Python module. Inside the noworkflow package, the now sub-module has the main sub-modules related to the command line (cmd), provenance collection (collection), database access (persistence), and the backend of the web tool that manages the provenance (vis). The frontend of the tool is in the root of the project under the npm folder.
Team
The main noWorkflow team is composed of researchers from Universidade Federal Fluminense (UFF) in Brazil and New York University (NYU) in the USA.
Former Collaborators
Applicant's Full Name
João Felipe Nicolaci Pimentel
Applicant's Email
joaofelipenp@gmail.com
Code of Conduct Agreement
Project Ideas Document Link
https://gist.github.com/JoaoFelipe/ce4cb232deb2c71d4f39afc5cbeefe2b