Look for bad get_current_user scripts

burnettk commented 1 month ago

also puts xml parsing in one place and makes it a little safer. also uses ruff check to avoid deprecation of ruff . deals with comment on #773 about using get_current_user breaking some instances.

Summary by CodeRabbit

New Features
- Introduced a script to identify uses of get_current_user() in BPMN files, enhancing security and compliance checks.
Refactor
- Improved the autofix process by integrating check with --fix in the Makefile and run_pyl script for more comprehensive code quality checks.
- Enhanced XML parsing security and efficiency in the process model testing service by updating method implementations for better file handling and parser configuration.
Chores
- Updated internal scripts and services for better performance and security compliance.

coderabbitai[bot] commented 1 month ago

Walkthrough

## Walkthrough The recent updates enhance code quality and security across several components. A notable change is the integration of `ruff` with the `check` and `--fix` options to improve code linting and autofixing. Additionally, a new script has been introduced to identify specific BPMN script tasks, enhancing the review process for workflow scripts. The handling of XML files has been refined, prioritizing security and efficiency in parsing and referencing within the backend services. ## Changes | File(s) | Summary | |----------------------------------------------------------------|-----------------------------------------------------------------------------------------------------| | Makefile, bin/run_pyl | Unified use of `ruff check --fix` for linting and autofixing across `Makefile` and `run_pyl` script.| | spiffworkflow-backend/.../lint_get_current_user_scripts.py | Added script to identify `get_current_user()` calls in BPMN files using XPath queries. | | spiffworkflow-backend/.../process_model_test_runner_service.py,
spiffworkflow-backend/.../spec_file_service.py | Enhanced XML parsing and file handling for security and efficiency improvements in backend services.|

Recent Review Status

**Configuration used: .coderabbit.yaml**

Commits

Files that changed from the base of the PR and between 293aa867a1cef056c5bee3ef037be31047fdc49e and b3d905e9a4e87fd7ff08d9d94ca19e95a3132a8e.

Files selected for processing (5)

* Makefile (1 hunks) * bin/run_pyl (1 hunks) * spiffworkflow-backend/bin/lint_get_current_user_scripts.py (1 hunks) * spiffworkflow-backend/src/spiffworkflow_backend/services/process_model_test_runner_service.py (2 hunks) * spiffworkflow-backend/src/spiffworkflow_backend/services/spec_file_service.py (1 hunks)

Additional Context Used

Learnings (1)

spiffworkflow-backend/src/spiffworkflow_backend/services/process_model_test_runner_service.py (1)
``` User: jbirddog" PR: sartography/spiff-arena#826 File: spiffworkflow-backend/src/spiffworkflow_backend/services/spec_file_service.py:0-0 Timestamp: 2023-12-20T16:18:04.439Z Learning: The user has removed the commented-out code related to `LookupFileModel` and `LookupDataModel` as suggested. ```

Additional comments not posted (6)

bin/run_pyl (1)
`27-27`: The update to include `check` in the `ruff` command aligns with current best practices.

Makefile (1)
`112-112`: The inclusion of `check` in the `ruff` command within the `Makefile` is a good practice.

spiffworkflow-backend/bin/lint_get_current_user_scripts.py (1)
`1-70`: The script effectively identifies `get_current_user()` usage within BPMN files and utilizes secure XML parsing practices.

spiffworkflow-backend/src/spiffworkflow_backend/services/spec_file_service.py (2)
`59-61`: Using `cls.full_file_path` in `get_references_for_file` enhances consistency in file path handling. --- `69-69`: The `get_etree_from_xml_bytes` method significantly improves XML parsing security by applying several important configurations.

spiffworkflow-backend/src/spiffworkflow_backend/services/process_model_test_runner_service.py (1)
`289-289`: Replacing direct etree parsing with `SpecFileService.get_etree_from_xml_bytes` in `_get_etree_from_bpmn_file` centralizes and secures XML parsing operations.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

- [X](https://twitter.com/intent/tweet?text=I%20just%20used%20%40coderabbitai%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20the%20proprietary%20code.%20Check%20it%20out%3A&url=https%3A//coderabbit.ai) - [Mastodon](https://mastodon.social/share?text=I%20just%20used%20%40coderabbitai%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20the%20proprietary%20code.%20Check%20it%20out%3A%20https%3A%2F%2Fcoderabbit.ai) - [Reddit](https://www.reddit.com/submit?title=Great%20tool%20for%20code%20review%20-%20CodeRabbit&text=I%20just%20used%20CodeRabbit%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20proprietary%20code.%20Check%20it%20out%3A%20https%3A//coderabbit.ai) - [LinkedIn](https://www.linkedin.com/sharing/share-offsite/?url=https%3A%2F%2Fcoderabbit.ai&mini=true&title=Great%20tool%20for%20code%20review%20-%20CodeRabbit&summary=I%20just%20used%20CodeRabbit%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20proprietary%20code)

---

Tips

### Chat There are 3 ways to chat with [CodeRabbit](https://coderabbit.ai): - Review comments: Directly reply to a review comment made by CodeRabbit. Example: - `I pushed a fix in commit .` - `Generate unit testing code for this file.` - `Open a follow-up GitHub issue for this discussion.` - Files and specific lines of code (under the "Files changed" tab): Tag `@coderabbitai` in a new review comment at the desired location with your query. Examples: - `@coderabbitai generate unit testing code for this file.` - `@coderabbitai modularize this function.` - PR comments: Tag `@coderabbitai` in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples: - `@coderabbitai generate interesting stats about this repository and render them as a table.` - `@coderabbitai show all the console.log statements in this repository.` - `@coderabbitai read src/utils.ts and generate unit testing code.` - `@coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.` Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. ### CodeRabbit Commands (invoked as PR comments) - `@coderabbitai pause` to pause the reviews on a PR. - `@coderabbitai resume` to resume the paused reviews. - `@coderabbitai review` to trigger a review. This is useful when automatic reviews are disabled for the repository. - `@coderabbitai resolve` resolve all the CodeRabbit review comments. - `@coderabbitai help` to get help. Additionally, you can add `@coderabbitai ignore` anywhere in the PR description to prevent this PR from being reviewed. ### CodeRabbit Configration File (`.coderabbit.yaml`) - You can programmatically configure CodeRabbit by adding a `.coderabbit.yaml` file to the root of your repository. - Please see the [configuration documentation](https://docs.coderabbit.ai/guides/configure-coderabbit) for more information. - If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: `# yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json` ### Documentation and Community - Visit our [Documentation](https://coderabbit.ai/docs) for detailed information on how to use CodeRabbit. - Join our [Discord Community](https://discord.com/invite/GsXnASn26c) to get help, request features, and share feedback. - Follow us on [X/Twitter](https://twitter.com/coderabbitai) for updates and announcements.

madhurrya commented 1 month ago

@burnettk is this deployed in dev.mod?

burnettk commented 1 month ago

there is nothing to deploy, since the script is run locally. if we want to know if there are bad scripts on a certain branch, i can check.

madhurrya commented 1 month ago

I am not clear KB. Doesn't these files related to this PR needs to be deployed in dev.mod if I need to check if this fixes the old instance issue I have mentioned in #773?

burnettk commented 1 month ago

@madhurrya there is no reasonable way to fix instances that were run with a process model that doesn't conform to these policies. ideally we fix process models so no new bad instances are created, flush out the bad instances until we feel we can manage the rest manually, and then deploy celery.

madhurrya commented 1 month ago

So when we deploy the celery to dev.app/test.app/test.mod/Prod the pp1/pp2/pp3, etc old instances will get this issue? Or are those models already updated, so we will not get this issue with old instances there.

burnettk commented 1 month ago

we could theoretically write a script to figure out if any open process instances could be affected. it would be significantly harder to write than the existing script, and even more difficult on prod, where we don't have access.

madhurrya commented 1 month ago

If we update the existing models accordingly now, then we'll not get this issue with the old instances of those updated models right? Have we already updated those models or not yet? Have you already discussed this with @sashayar13 ?

sashayar13 commented 1 month ago

I haven't updated all process models yet, was planning to do so during the week. But it will not solve the issue - there still will be old open instances in prod since people are doing the expense submission to the previously opened Travel and Purchase requests. We need to have a way to handle such old instances

burnettk commented 1 month ago

ok, first we need to fix the models in prod, and then i see a few options here:

note the time that we updated prod, maybe 30 apr 2024, or whenever that happens. wait a few weeks. do a query for instances created before 30 apr 2024. those instances are bad. if there are 3, talk to the the people who "own" the instances, mitigate the potential impact, and deploy celery. if there are 50, wait a few more weeks. eventually there are few enough instances that we can safely release celery without causing too much damage.
write a script that detects bad instances. this would take about 3 or 4 days. that way we could have a better idea about which of the 50 instances could actually be affected by the get_current_user issue. this is not the greatest type of work because part of it would only be applicable for this one issue, and then we'd throw it away.
try to really understand the nature of the process models that are affected and figure out a way to make old instances work by writing some sort of script. this is likely to be a 5 to 10 day task, most of the value of which wouldn't be carried forward (the learnings and code produced would only be applicable for this one issue, and then we'd throw it away), so i wouldn't recommend it.

sashayar13 commented 1 month ago

@burnettk, we have already updated all process models affected by the change. These process models are in prod from 29.04

I can check how many open instances we have created before this date and let you know. Based on the number and nature we can see which option is acceptable

sashayar13 commented 1 month ago

373 not completed process instances with the start date before 29.04.2024 Next step - analyse the nature of the process instances and plan the next steps accordingly

madhurrya commented 1 month ago

Since we are still testing and bug fixing things in Celery I assume it'll take atleast another month or so to reach it to Prod. By that time wouldn't most of those instances will be closed? Still we'll have to take action for any instances that'll be open.

sashayar13 commented 3 weeks ago

@madhurrya, may I ask you please to inform me a week prior to the change release date?

madhurrya commented 3 weeks ago

@sashayar13 I guess you mean before the Celery release. Yes sure will let you know.

sartography / spiff-arena

Look for bad get_current_user scripts #1370

Summary by CodeRabbit