Open tomsmith8 opened 2 months ago
Bounty posted: https://community.sphinx.chat/bounty/2442
Since the codebase is written in Go and the tool needs to parse Go code, it is most efficient to write the parsing logic in Go as well. AWS Lambda supports custom runtimes and Docker images, so the Go application can be packaged in a Docker container for Lambda deployment.
If anyone has experience with this, let me know.
Task Overview
Codebase:
https://github.com/stakwork/sphinx-tribes
Language: Go
Develop a tool or script, packaged as an AWS Lambda function, that analyzes an entire Go codebase and extracts its structural elements into JSON.
The tool should parse the code using Abstract Syntax Trees (ASTs) and generate nodes and edges in a predefined JSON format. The goal is to accurately extract various components of the codebase, including:
(Include any additional elements that ASTs can accurately extract and are valuable for understanding the codebase)
Strategy
Utilize Go's standard library packages (go/parser, go/ast, go/token) for parsing the codebase, or use tree-sitter.
Iterate over all relevant .go files in the codebase. Parse each file into an AST for detailed analysis.
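A minimal sketch of the file iteration step, assuming "relevant" means non-test .go files outside vendored and hidden directories (that filter is an assumption, not something the task specifies). The names shouldParse and parseTree are hypothetical helpers introduced for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// shouldParse reports whether path looks like a Go source file worth analyzing.
// Skipping _test.go files is an assumption, not a requirement from the task.
func shouldParse(path string) bool {
	return strings.HasSuffix(path, ".go") && !strings.HasSuffix(path, "_test.go")
}

// parseTree walks root and parses every relevant file into an AST,
// keyed by file path. All files share one FileSet so positions stay comparable.
func parseTree(root string) (*token.FileSet, map[string]*ast.File, error) {
	fset := token.NewFileSet()
	files := make(map[string]*ast.File)
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			// Assumption: vendored and hidden directories are not part of the graph.
			name := d.Name()
			if path != root && (name == "vendor" || strings.HasPrefix(name, ".")) {
				return filepath.SkipDir
			}
			return nil
		}
		if !shouldParse(path) {
			return nil
		}
		f, err := parser.ParseFile(fset, path, nil, parser.ParseComments)
		if err != nil {
			return fmt.Errorf("parse %s: %w", path, err)
		}
		files[path] = f
		return nil
	})
	return fset, files, err
}

func main() {
	root := "."
	if len(os.Args) > 1 {
		root = os.Args[1]
	}
	_, files, err := parseTree(root)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("parsed %d files\n", len(files))
}
```

Run against a checkout by passing its path as the first argument. A parse error in any single file aborts the walk here; a real tool might prefer to log and continue.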
Traverse the ASTs to identify and extract the required node types. Capture details such as names, parameters, return types, variable declarations, and type definitions.
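The traversal step could be sketched as follows for function and method declarations, using ast.Inspect from the standard library. FuncInfo, fieldTypes, and extractFuncs are hypothetical names, and the sample source is invented for the demonstration; other node types (structs, interfaces, variables) would need their own cases.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// FuncInfo is a hypothetical record for one function or method declaration.
type FuncInfo struct {
	Name     string
	Receiver string   // empty for plain functions
	Params   []string // parameter types, in order
	Results  []string // result types, in order
}

// fieldTypes renders the type of every field in a list, repeating the
// type once per name so that "a, b int" yields two entries.
func fieldTypes(fl *ast.FieldList) []string {
	var out []string
	if fl == nil {
		return out
	}
	for _, f := range fl.List {
		t := types.ExprString(f.Type)
		n := len(f.Names)
		if n == 0 {
			n = 1 // unnamed parameter or result
		}
		for i := 0; i < n; i++ {
			out = append(out, t)
		}
	}
	return out
}

// extractFuncs parses src and collects every function declaration in source order.
func extractFuncs(src string) ([]FuncInfo, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return nil, err
	}
	var funcs []FuncInfo
	ast.Inspect(file, func(n ast.Node) bool {
		fd, ok := n.(*ast.FuncDecl)
		if !ok {
			return true
		}
		info := FuncInfo{
			Name:    fd.Name.Name,
			Params:  fieldTypes(fd.Type.Params),
			Results: fieldTypes(fd.Type.Results),
		}
		if fd.Recv != nil && len(fd.Recv.List) > 0 {
			info.Receiver = types.ExprString(fd.Recv.List[0].Type)
		}
		funcs = append(funcs, info)
		return true
	})
	return funcs, nil
}

// sample is invented input used only for this demonstration.
const sample = `package demo

type Store struct{}

func (s *Store) Get(id string, limit int) ([]string, error) { return nil, nil }

func Helper(a, b int) int { return a + b }
`

func main() {
	funcs, err := extractFuncs(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, f := range funcs {
		fmt.Printf("%+v\n", f)
	}
}
```

Types are rendered syntactically with types.ExprString, so no type checking is required; the trade-off is that aliases and imported names are not resolved.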
Define nodes for each extracted element. Establish edges representing relationships between nodes (e.g., function calls, method receivers, type embeddings).
Use a predefined JSON structure for consistency.
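As one illustration of deriving edges, the sketch below emits a "CALLS" edge for each direct call between functions declared in the same file. The Edge shape and the callEdges helper are hypothetical placeholders, since the predefined schema is not reproduced here; calls through selectors (pkg.Func, obj.Method) are deliberately ignored because resolving them reliably needs type information.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// Edge is a hypothetical shape; the task's real schema should replace it.
type Edge struct {
	Source string
	Type   string
	Target string
}

// callEdges returns one "CALLS" edge per direct call from a declared
// function to another top-level function declared in the same file.
func callEdges(src string) ([]Edge, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return nil, err
	}

	// First pass: record the names of top-level functions (not methods).
	declared := map[string]bool{}
	for _, d := range file.Decls {
		if fd, ok := d.(*ast.FuncDecl); ok && fd.Recv == nil {
			declared[fd.Name.Name] = true
		}
	}

	// Second pass: walk each body and match plain identifier calls.
	var edges []Edge
	for _, d := range file.Decls {
		fd, ok := d.(*ast.FuncDecl)
		if !ok || fd.Body == nil {
			continue
		}
		caller := fd.Name.Name
		ast.Inspect(fd.Body, func(n ast.Node) bool {
			call, ok := n.(*ast.CallExpr)
			if !ok {
				return true
			}
			if id, ok := call.Fun.(*ast.Ident); ok && declared[id.Name] {
				edges = append(edges, Edge{Source: caller, Type: "CALLS", Target: id.Name})
			}
			return true
		})
	}
	return edges, nil
}

// sample is invented input used only for this demonstration.
const sample = `package demo

func load() int { return 1 }

func process() int { return load() + len("x") }

func main() { process() }
`

func main() {
	edges, err := callEdges(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, e := range edges {
		fmt.Printf("%s -> %s -> %s\n", e.Source, e.Type, e.Target)
	}
}
```

Matching by identifier name is a purely syntactic approximation: a local variable that shadows a function name would produce a false edge. Cross-package and method calls would likely need go/types or golang.org/x/tools/go/packages.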
Generating the Knowledge Graph
Compile the nodes and edges into a coherent knowledge graph. Output the graph in the specified JSON format.
Verify the accuracy of the extracted data. Ensure that the knowledge graph correctly represents the codebase structure.
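The output step might look like the sketch below, which serializes a graph with encoding/json. The Node, Edge, and Graph field names and the ID format are assumptions made for illustration only; they should be replaced by the schemas defined in this issue.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// Node, Edge, and Graph are hypothetical shapes, not the issue's schema.
type Node struct {
	ID   string `json:"id"`
	Type string `json:"type"`
	Name string `json:"name"`
	File string `json:"file"`
}

type Edge struct {
	Source string `json:"source"`
	Target string `json:"target"`
	Type   string `json:"type"`
}

type Graph struct {
	Nodes []Node `json:"nodes"`
	Edges []Edge `json:"edges"`
}

// encodeGraph renders the graph as indented JSON. Nil slices are replaced
// with empty ones so consumers always see arrays instead of null.
func encodeGraph(g Graph) ([]byte, error) {
	if g.Nodes == nil {
		g.Nodes = []Node{}
	}
	if g.Edges == nil {
		g.Edges = []Edge{}
	}
	return json.MarshalIndent(g, "", "  ")
}

func main() {
	// Invented example data; the ID format is an assumption.
	g := Graph{
		Nodes: []Node{
			{ID: "func:process", Type: "Function", Name: "process", File: "main.go"},
			{ID: "func:load", Type: "Function", Name: "load", File: "main.go"},
		},
		Edges: []Edge{
			{Source: "func:process", Target: "func:load", Type: "CALLS"},
		},
	}
	out, err := encodeGraph(g)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	os.Stdout.Write(append(out, '\n'))
}
```

A simple verification pass could check that every edge's source and target refer to an existing node ID before the JSON is written.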
Nodes and Edges Structure (JSON)
Node Structure
Each node will have the following structure:
Edge Structure
Each edge will have the following structure:
Example Nodes
Node
Example Edges
Function -> Calls -> Function
Required Output
JSON output containing nodes and edges arrays, structured as per the defined schemas.
Deliverables
Acceptance Criteria