GoogleChromeLabs / ps-analysis-tool

Privacy Sandbox Analysis Chrome Extension and CLI for analysis and understanding of cookie usage on web pages, and new privacy-preserving Chrome APIs
https://www.privacysandbox.com
Apache License 2.0
100 stars, 23 forks

Enhance Bulk URL Analysis Error Handling #817

Open: tsunoyu opened this issue 2 months ago

tsunoyu commented 2 months ago

Feature Request: Enhance Bulk URL Analysis Error Handling

Description:

Improve the bulk URL analysis tool's error handling to ensure it continues processing even when individual URLs encounter errors. Provide detailed error reporting, including the specific URL that triggered the error, to allow users to investigate further.
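To make the request concrete, here is a minimal sketch of the behavior described above: each URL is analyzed inside its own try/catch, so one failure is recorded against its URL and the run continues. The names `analyzeAll`, `UrlResult`, and the `analyze` callback are hypothetical and only stand in for whatever PSAT does per URL; this is not the tool's actual code.

```typescript
// Hypothetical result shape; PSAT's real types are not shown in this thread.
type UrlResult<T> =
  | { url: string; ok: true; data: T }
  | { url: string; ok: false; error: string };

// Runs `analyze` on every URL in turn. A failure is recorded together with
// the URL that caused it, and the loop moves on instead of aborting the run.
async function analyzeAll<T>(
  urls: string[],
  analyze: (url: string) => Promise<T>
): Promise<UrlResult<T>[]> {
  const results: UrlResult<T>[] = [];
  for (const url of urls) {
    try {
      results.push({ url, ok: true, data: await analyze(url) });
    } catch (err) {
      // Keep only the message so the entry is easy to print in a report.
      const error = err instanceof Error ? err.message : String(err);
      results.push({ url, ok: false, error });
    }
  }
  return results;
}
```

The sequential loop keeps the sketch simple. A real crawl of 1000+ URLs would probably want bounded concurrency, but the per-URL isolation would stay the same.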

Motivation:

The current behavior of the tool, halting analysis and report generation entirely when an error occurs, hinders productivity. Users need to manually identify the problematic URLs and re-run the analysis, which is time-consuming. This enhancement will improve the tool's robustness and usability.

User Story:

When analyzing 1000+ URLs in bulk, I want the tool to gracefully handle errors and continue processing the remaining URLs, so that it can complete the full analysis and I get a comprehensive report even if some URLs fail. I also want the tool to explicitly identify the URLs that triggered errors so that I can investigate them further if needed.

Acceptance Criteria:

Additional Information:

This enhancement will significantly improve the user experience by making the bulk URL analysis tool more resilient and informative.

milindmore22 commented 2 months ago

Hi @tsunoyu,

Thanks for reporting the error. You're absolutely right that errors should be handled gracefully to ensure the process continues for the rest of the URLs in the sitemap.

Unfortunately, we haven't been able to replicate the issue on our end. Please share the specific sitemap or CSV file you used so we can try to reproduce it and find a solution.

amedina commented 2 months ago

@tsunoyu, thanks for your request. This is on our roadmap, and this more robust behavior will be part of PSAT in an upcoming version. In a nutshell, the output of aggregated analyses (i.e., sitemap/CSV) should be resilient to URL errors, and a summary of such errors should be part of the generated report. We will keep this issue open until we merge the corresponding changes.
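As a rough illustration of what such an error summary might look like, the sketch below groups failed URLs by error message and renders them as a plain-text report section. `UrlFailure` and `summarizeFailures` are hypothetical names, and the output layout is an assumption; the format PSAT will actually use has not been stated in this thread.

```typescript
// Hypothetical failure record: a URL plus the error message it produced.
interface UrlFailure {
  url: string;
  error: string;
}

// Groups failed URLs by error message so that each distinct error is listed
// once, followed by the URLs that hit it.
function summarizeFailures(failures: UrlFailure[]): string {
  if (failures.length === 0) {
    return "All URLs analyzed successfully.";
  }
  const byError = new Map<string, string[]>();
  for (const failure of failures) {
    const urls = byError.get(failure.error) ?? [];
    urls.push(failure.url);
    byError.set(failure.error, urls);
  }
  const lines: string[] = [`${failures.length} URL(s) failed:`];
  // Map preserves insertion order, so errors appear in first-seen order.
  byError.forEach((urls, error) => {
    lines.push(`- ${error} (${urls.length})`);
    for (const url of urls) {
      lines.push(`    ${url}`);
    }
  });
  return lines.join("\n");
}
```

Grouping by message keeps the summary short when many URLs fail for the same reason, such as a timeout, while still naming every affected URL.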

maitreyie-chavan commented 2 months ago

A related issue - https://github.com/GoogleChromeLabs/ps-analysis-tool/issues/818