Add relationships types

https://app.blar.io/chat/653

PR Summary

This pull request comprises several refactorings and enhancements aimed at improving code readability, maintainability, and flexibility in the repository. Key updates include:

Refactoring of static methods to include explicit parameter names for clarity.
Amendments to the main function to allow environment variables to be passed as parameters, enhancing flexibility in different environments.
Introduction of a method to abstract project file iteration, improving code organization.
Refactoring to use **kwargs in the FunctionNode class constructor, simplifying code but requiring careful validation.
Integration of the tree_sitter_node parameter in node creation methods to enhance syntax tree analysis.
Refactoring import statements to utilize TYPE_CHECKING, minimizing circular import issues.
Enhancement of the RelationshipCreator class to skip self-referencing nodes.

Key Findings

Potential Bugs:
- NodeFactory: Limited handling of future symbol kinds in the create_node_based_on_kind method could lead to runtime errors. Lack of validation for new tree_sitter_node attributes can introduce issues if not carefully handled.
- TreeSitterHelper: The get_reference_type method is incomplete, potentially misrepresenting non-call relationships.
Bad Practices:
- Use of string literals for type annotations may hinder type checking and IDE functionality (NodeFactory, Relationship).
- Allowing None as parameters in the main function without validation can lead to unhandled exceptions.
- Use of **kwargs without specified keys in FunctionNode can obscure error identification.
- Hardcoded dictionary accesses without validation in TreeSitterHelper can raise KeyError.
Code Clarity:
- Self-reference logic in RelationshipCreator would benefit from more comprehensive docstrings and inline documentation for maintainability.

Overall, while the refactoring and enhancements align with improving codebase structure and flexibility, attention to validation and detailed documentation is crucial to prevent potential runtime errors and maintain development ease.

Commit Reviews

**Commit**: refactor: add parameter names #### Summary: The commit refactored multiple static methods within the `NodeFactory` class by adding explicit parameter names to the constructor calls of various nodes like `FolderNode`, `FileNode`, `ClassNode`, and `FunctionNode`. This change enhances code readability and ensures that the constructor parameters are clearly mapped to the arguments passed, reducing potential issues from parameter misalignment. The commit is marked as mergeable. #### Issues Found: 1. **Type**: Potential Bug **Scope**: temp/repos/main/blarApp_lsp-poc/lsp-poc/Graph/Node/utils/NodeFactory.py in the class NodeFactory and the method create_node_based_on_kind **Description**: In the `create_node_based_on_kind`, a `ValueError` is raised if the `kind` is neither a Class nor a Function. While this is a controlled error, the method assumes only two kinds are permissible. If other symbol kinds are introduced, this could become a source of runtime errors. **Code Snippet**: ```python else: raise ValueError(f"Kind {kind} is not supported") ``` 2. **Type**: Bad Practice **Scope**: temp/repos/main/blarApp_lsp-poc/lsp-poc/Graph/Node/utils/NodeFactory.py in the file NodeFactory.py **Description**: Usage of string literals for type annotations (e.g., `"Folder"`, `"TreeSitterNode"`) could impede the benefits provided by type checkers and IDE features. While it's necessary for `TYPE_CHECKING` conditionals to avoid circular imports, it’s preferable to use actual types where possible. **Code Snippet**: ```python from TYPING import TYPE_CHECKING if TYPE_CHECKING: from Files import Folder from tree_sitter import Node as TreeSitterNode ``` No explicit performance or security issues were observed in this commit. The added parameter names do not affect the logic of the methods but improve maintainability and reduce error likelihood from misordered arguments. **Commit**: pass env vars as parameters to main #### Summary: The commit refactors the `main` function in `main.py` to accept the `ROOT_PATH` and `BLARIGNORE_PATH` environment variables as optional parameters. This allows for these environment variables to be passed directly to the function rather than being exclusively loaded via `dotenv`, enhancing flexibility in how the function can be invoked. The `dotenv.load_dotenv()` call is kept within the `__main__` block to initialize environment variables when running the script directly. #### Issues Found: 1. **Type**: Bad Practice **Scope**: temp/repos/main/blarApp_lsp-poc/lsp-poc/main.py in the main function **Description**: The `main` function now allows `root_path` and `blarignore_path` to be passed as parameters, but there is no validation to check if these parameters are `None`. If either of these environment variables is missing or incorrectly named in the `.env` file, `None` could be passed to constructors or functions that may not handle `None` gracefully. **Code Snippet**: ```python def main(root_path: str = None, blarignore_path: str = None): lsp_caller = LspCaller(root_uri=root_path) #... ``` No explicit performance or security issues were detected within this particular commit. The adjustment enhances code modularity by allowing these variables to be supplied as arguments, which could simplify testing and deployment in different environments. **Commit**: refactor: abstract project iteration #### Summary: The commit refactors the `ProjectGraphCreator.py` file by abstracting the project file iteration logic into a separate method called `create_code_hierarchy`. This change improves code organization and readability, allowing the build process to be broken into more distinct steps. Additionally, the commit modifies the parameters of a method to include `tree_sitter_helper`, which enhances its functionality by providing a more complete context for operations involving node relationships. #### Issues Found: 1. **Type**: Bad Practice **Scope**: temp/repos/main/blarApp_lsp-poc/lsp-poc/Graph/ProjectGraphCreator.py in the method create_code_hierarchy **Description**: The method `create_code_hierarchy` is invoked within `build()`, but the relationship between folders and files is directly processed within `process_folder`, which might lead to redundancy or lack of clarity regarding the sequence of operations especially when further modifications are conducted. **Code Snippet**: ```python def create_code_hierarchy(self): for folder in self.project_files_iterator: self.process_folder(folder) ``` 2. **Type**: Potential Bug **Scope**: temp/repos/main/blarApp_lsp-poc/lsp-poc/Graph/ProjectGraphCreator.py in the method create_node_relationships **Description**: The `create_node_relationships` method, now using `tree_sitter_helper` as an additional parameter, lacks any validation or error handling around its usage. This could lead to unexpected behavior if `tree_sitter_helper` is not correctly initialized or used within this context. **Code Snippet**: ```python relationships = RelationshipCreator.create_relationships_from_paths_where_node_is_referenced( references=references, node=node, graph=self.graph, tree_sitter_helper=self.tree_sitter_helper ) ``` No direct performance or security issues were identified in this refactoring process, but the inclusion of additional parameters might necessitate further validation checks to prevent runtime errors. The abstraction aims to enhance code clarity, although it must ensure consistency in how relationships and nodes are processed. **Commit**: refactor: use kwargs #### Summary: The commit refactors the `FunctionNode` class by altering the `__init__` method to use `**kwargs` instead of explicitly listing all individual parameters. This refactor simplifies the constructor by allowing all arguments to be passed as keyword arguments and reducing boilerplate code. This method then forwards these `kwargs` to the superclass constructor after explicitly setting `label=NodeLabels.FUNCTION`. #### Issues Found: 1. **Type**: Potential Bug **Scope**: temp/repos/main/blarApp_lsp-poc/lsp-poc/Graph/Node/FunctionNode.py in the class FunctionNode and the method __init__ **Description**: The change to accept `**kwargs` without specifying expected keys introduces the risk of runtime errors due to key mismatches. If required keys are missing in `kwargs` or incorrect keys are inadvertently passed, this can locally break the functionality where the object is instantiated. **Code Snippet**: ```python def __init__(self, **kwargs): super().__init__(label=NodeLabels.FUNCTION, **kwargs) ``` 2. **Type**: Bad Practice **Scope**: temp/repos/main/blarApp_lsp-poc/lsp-poc/Graph/Node/FunctionNode.py in the class FunctionNode and the method __init__ **Description**: Using `**kwargs` can obscure code readability and operation for other developers unfamiliar with the expected arguments, leading to maintenance challenges. **Code Snippet**: ```python def __init__(self, **kwargs): super().__init__(label=NodeLabels.FUNCTION, **kwargs) ``` No performance or security issues were detected inherently from this refactor; however, it places an implicit requirement on ensuring that `kwargs` are properly populated and validated before object construction. This can potentially lead to challenges in debugging and further development if changes are made without careful consideration of affected parameters. **Commit**: add tree-sitter node as DefineNode attribute #### Summary: The commit refactors the `NodeFactory` class to include a `tree_sitter_node` parameter in its node creation methods: `create_file_node`, `create_class_node`, `create_function_node`, and `create_node_based_on_kind`. This parameter addition to the node creation functions suggests an enhancement to utilize Tree-sitter nodes while constructing these nodes, presumably for more detailed syntax tree analysis. #### Issues Found: 1. **Type**: Bad Practice **Scope**: temp/repos/main/blarApp_lsp-poc/lsp-poc/Graph/Node/utils/NodeFactory.py in the class NodeFactory and the methods create_file_node, create_class_node, create_function_node **Description**: The addition of `tree_sitter_node` as a parameter means this attribute is now required whenever creating these nodes, yet the context or required format for `tree_sitter_node` isn't documented. Developers must understand this attribute's expected usage and constraints to avoid runtime issues which could arise if it is improperly initialized. **Code Snippet**: ```python def create_file_node( ..., tree_sitter_node: "TreeSitterNode" ) -> FileNode: ``` 2. **Type**: Potential Bug **Scope**: temp/repos/main/blarApp_lsp-poc/lsp-poc/Graph/Node/utils/NodeFactory.py in the class NodeFactory and the method create_node_based_on_kind **Description**: The method `create_node_based_on_kind` now depends on `tree_sitter_node` regardless of whether it's a `ClassNode` or `FunctionNode`. If this parameter isn't carefully managed or appropriately passed in all contexts, this could cause runtime exceptions due to missing or incompatible `tree_sitter_node` attributes. **Code Snippet**: ```python def create_node_based_on_kind( ..., tree_sitter_node: "TreeSitterNode" ) -> Union[ClassNode, FunctionNode]: ... ``` No explicit performance or security issues were noted within this commit context, but ensuring input validation, especially for the newly introduced `tree_sitter_node`, would bolster stability and prevent potential misuse. **Commit**: refactor: import typing #### Summary: The commit modifies the `Relationship` class by refactoring the import statements to use `TYPE_CHECKING` from the `typing` module. This change is aimed at improving the manageability of imports, particularly to avoid circular imports and potentially unnecessary declarations in runtime by using forward declarations for type hints (`"Node"`, `"RelationshipType"`). #### Issues Found: 1. **Type**: Bad Practice **Scope**: temp/repos/main/blarApp_lsp-poc/lsp-poc/Graph/Relationship/Relationship.py in the class Relationship **Description**: While the use of `TYPE_CHECKING` helps avoid circular dependencies, the reliance on string literals for type hints (`"Node"`, `"RelationshipType"`) might obscure type checking and autocomplete functionality in some IDEs. It's essential to ensure that the runtime environment and linters are configured to recognize these forward references correctly. **Code Snippet**: ```python start_node: "Node" end_node: "Node" rel_type: "RelationshipType" ``` No immediate performance or security issues were observed from this refactoring. The changes improve structural integrity when the code is executed, particularly when dealing with complex module dependencies. However, developers should ensure their development tools are compatible with string-based type hints to maintain productivity. **Commit**: handle relationships types #### Summary: The commit enhances the `TreeSitterHelper` class by integrating functionality for handling relationships types, specifically adding the `get_reference_type` method. This method is designed to determine the relationship type between nodes based on tree-sitter's syntax tree nodes. Additionally, the changes refine type annotations using `TYPE_CHECKING` and include adjustments for incorporating `tree_sitter_node` in node creation functions. #### Issues Found: 1. **Type**: Potential Bug **Scope**: temp/repos/main/blarApp_lsp-poc/lsp-poc/TreeSitter/TreeSitterHelper.py in the class TreeSitterHelper and the method get_reference_type **Description**: The `get_reference_type` method currently defaults to returning `RelationshipType.CALLS` and includes commented-out code for determining specific relationship types based on syntax tree traversal. This placeholder may cause incorrect relationship type handling until fully implemented. **Code Snippet**: ```python # while named_parent.grammar_name not in rel_types: # named_parent = named_parent.parent # return rel_types[named_parent.grammar_name] return RelationshipType.CALLS ``` 2. **Type**: Bad Practice **Scope**: temp/repos/main/blarApp_lsp-poc/lsp-poc/TreeSitter/TreeSitterHelper.py in the class TreeSitterHelper and the method get_reference_type **Description**: The function `get_reference_type` uses hardcoded dictionary field accesses without verifying if the keys exist. This can lead to `KeyError` if the `reference` dictionary doesn't contain expected keys. **Code Snippet**: ```python start_line = reference["range"]["start"]["line"] start_char = reference["range"]["start"]["character"] end_line = reference["range"]["end"]["line"] end_char = reference["range"]["end"]["character"] ``` Such issues highlight the importance of robust code completion, particularly when dealing with tree structures and inferred relationships where clear type mapping and handling are critical to functionality and avoiding exceptions. **Commit**: refactor: simplify _get_text_for_skeleton method and remove unused get_function_call_query #### Summary: The commit refactors the `PythonDefinitions` class by simplifying its code and removing the unused `get_function_call_query` method. Additionally, it introduces the `get_relationships_group_types` method, which maps relationship types to corresponding `RelationshipType` enums, facilitating the handling of different relationships such as calls, imports, inheritance, and instantiation. #### Issues Found: 1. **Type**: Bad Practice **Scope**: temp/repos/main/blarApp_lsp-poc/lsp-poc/TreeSitter/Languages/PythonDefinitions.py in the class PythonDefinitions and the method get_relationships_group_types **Description**: The `get_relationships_group_types` method returns a static dictionary mapping string keys to `RelationshipType` values. If additional relationship types are introduced or if modifications are made elsewhere, without corresponding updates here, this could lead to incomplete or incorrect relationship mapping. **Code Snippet**: ```python def get_relationships_group_types() -> dict[str, RelationshipType]: return { "function_call": RelationshipType.CALLS, "import": RelationshipType.IMPORTS, "inheritance": RelationshipType.INHERITS, "instantiation": RelationshipType.INSTANTIATES, } ``` No significant performance or security issues were detected. The refactor aids in transparency by pruning unused methods, although care should be taken to ensure that the static mapping remains current with the evolving relationship handling logic. **Commit**: Merge branch 'add-relationships-types' of github.com:blarApp/lsp-poc into dev #### Summary: The commit merges changes from a branch titled `add-relationships-types` into `dev`, particularly affecting the `main.py` file. This update refactors the `main` function to accept `root_path` and `blarignore_path` as optional parameters. The changes remove the direct loading of environment variables within the function itself, instead handling them in the script initialization block. This enhances flexibility and clarity in function utilization. #### Issues Found: 1. **Type**: Bad Practice **Scope**: temp/repos/main/blarApp_lsp-poc/lsp-poc/main.py in the main function **Description**: Converting environment variables into optional function parameters allows for greater flexibility in function calls. However, there is no validation to ensure these parameters are appropriately set, which could lead to issues if `None` values are passed to dependent systems or functions without guards against `None` values. **Code Snippet**: ```python def main(root_path: str = None, blarignore_path: str = None): lsp_caller = LspCaller(root_uri=root_path) #... ``` No explicit security or performance issues were observed from the code changes. The introduced changes maintain existing functionality while modularizing environment setup. **Commit**: fix: merge conflicts #### Summary: The commit addresses and resolves merge conflicts in the `TreeSitterHelper.py` file. The fixes focus on integrating `RelationshipType` logic into the `TreeSitterHelper` class, enhancing its capability to classify relationships between nodes. Additionally, it includes adjustments in method signatures to incorporate the `tree_sitter_node` parameter and some type hint improvements using forward references. #### Issues Found: 1. **Type**: Potential Bug **Scope**: temp/repos/main/blarApp_lsp-poc/lsp-poc/TreeSitter/TreeSitterHelper.py in the class TreeSitterHelper and the method get_reference_type **Description**: The `get_reference_type` method currently defaults to returning `RelationshipType.CALLS` without fully implementing the logic intended to derive the type based on the node's named parent. The incomplete traversal logic remains commented out, leaving the function's behavior potentially incorrect for non-call relationships. **Code Snippet**: ```python # while named_parent.grammar_name not in rel_types: # named_parent = named_parent.parent # return rel_types[named_parent.grammar_name] return RelationshipType.CALLS ``` 2. **Type**: Bad Practice **Scope**: temp/repos/main/blarApp_lsp-poc/lsp-poc/TreeSitter/TreeSitterHelper.py in the class TreeSitterHelper and the method get_reference_type **Description**: Direct access to dictionary keys (`reference["range"]["start"]["line"]`) without validation or default setting may lead to runtime errors like `KeyError` if the expected keys are absent or malformed. **Code Snippet**: ```python start_line = reference["range"]["start"]["line"] start_char = reference["range"]["start"]["character"] ``` No immediate security, performance issues, or new functionalities were introduced beyond resolving the merge conflicts and refocusing the code to support relationship type handling. These changes emphasize completeness and robustness in implementing the commented-out logic for accurate relationship determination. **Commit**: skip reference to itself #### Summary: The recent commit introduces logic in the `RelationshipCreator` class to handle and skip references where a node references itself. Specifically, during the creation of relationships from paths where a node is referenced, the procedure now checks if the referenced node is the same as the starting node and skips it if true. This prevents creating redundant self-referencing relationships. The commit also integrates `TreeSitterHelper` to determine the specific relationship type between nodes. #### Issues Found: 1. **Type**: Code Clarity **Scope**: temp/repos/main/blarApp_lsp-poc/lsp-poc/Graph/Relationship/RelationshipCreator.py in the class RelationshipCreator and the method create_relationships_from_paths_where_node_is_referenced **Description**: The function has an inline comment indicating logic for self-reference skipping, but further clarity can be ensured with more explicit documentation in terms of docstrings for better maintainability especially if more complex relationship determinations are expected to be supported. **Code Snippet**: ```python if reference == node.path: continue ... if node_referenced is None or node.id == node_referenced.id: continue ``` No functional or security issues were evident in this commit; the refactor simplifies relationship logic by preventing unnecessary self-referencing which can clutter relationship data without adding value.

blarApp / lsp-poc

Add relationships types #9

PR Summary

Key Findings