integration with form - Githubissues

kishore03109 commented 8 months ago

Problem

Allows us to run the script and update the isomer link checker repo directly. NOTE: this is not just for observability, the function calls in RepoCheckerService will be used in the subsequent PRs as well!

Solution

Hook to formsg

Breaking Changes

[ ] Yes - this PR contains breaking changes
- Details ...
[X] No - this PR is backwards compatible with ALL of the following feature flags in this doc

Features:

// todo run this in staging env and notice that in staging, the things have been running as expected

Improvements:

Details ...

Bug Fixes:

Details ...

Before & After Screenshots

BEFORE:

AFTER:

Tests

Deploy Notes

New environment variables:

SITE_CHECKER_FORM_KEY : form key bah
- [ ] added env var to 1PW + SSM script (fetch_ssm_parameters.sh)

New scripts:

script : script details

New dependencies:

papaparse : easily parse the errors data objects into csv with ease

New dev dependencies:

dependency : dependency details

kishore03109 commented 8 months ago

Current dependencies on/for this PR:

#1148
#1121 👈
#1119
#1118
develop

This stack of pull requests is managed by Graphite.

kishore03109 commented 8 months ago

This method requires a manual trigger on our end and doesn't enforce the link fix. Should we instead just make the checking a GitHub action and add it to our repos + enforce that merges to production have 0 failures? GitHub exposes an actions API so this should be doable! We can expose the output as a txt artifact and show it to admins too.

Hey @seaerchin PR is only scoped to get observability for now to justify to ourselves that this is a worthy problem to be solved. This is not yet close to the final output of what the users will see via the CMS with regards to broken links!

seaerchin commented 8 months ago

This method requires a manual trigger on our end and doesn't enforce the link fix. Should we instead just make the checking a GitHub action and add it to our repos + enforce that merges to production have 0 failures? GitHub exposes an actions API so this should be doable! We can expose the output as a txt artifact and show it to admins too.

Hey @seaerchin PR is only scoped to get observability for now to justify to ourselves that this is a worthy problem to be solved. This is not yet close to the final output of what the users will see via the CMS with regards to broken links!

ok so let me clarify - if the medium that we're choosing now (presenting a leaderboard) is a temporary measure to get feedback from users, this is fine!

my main worry is that there's not much incentive for users to go fix their links and hence, i'd be more in favour of just piggybacking off gha, which solves the problem of forcing users to fix their broken links cleanly w/ no need for manual triggers.

if this is just for proto, lmk and i'll take a look/most probably approve

kishore03109 commented 8 months ago

@seaerchin

ok so let me clarify - if the medium that we're choosing now (presenting a leaderboard) is a temporary measure to get feedback from users, this is fine! my main worry is that there's not much incentive for users to go fix their links and hence, i'd be more in favour of just piggybacking off gha, which solves the problem of forcing users to fix their broken links cleanly w/ no need for manual triggers. if this is just for proto, lmk and i'll take a look/most probably approve

Hey its not for external consumption! will be covering soon on how we will present it to users ya. But yes the mental model here is a proto, since we are not aware how users feel about their sites/are deligent about say older pages with broken links. This exists as a sot to track this features success. if our actions dont shift the needle, this is the yardstick we will use to shift and change tactics

seaerchin commented 8 months ago

if this proto is jsut to get feedback, could we:

rename vars from a, b to somethign more descriptive
add error for empty href/img src
separate sub methods out (checker/reporter etc)
fix the async + forEach

rest can ignore

kishore03109 commented 8 months ago

hmm, no not exactly - the error here is a derived state of our repos. we shouldn't need to declare it at hte top and then do a .push - this loses locality and makes us prone to errors (scope is within lifetime of function). we might, instead, want to have smaller methods that just return each category of error and we do the combine ourselves:

hmm wait arh while this is more declarative it is a lot less performant leh, and will scale up to (c*n) time complexity, where c is the number of errors that we acknowledge to exist which is a growing number as we are doing more user interviews! maybe a bit of context here is that in our user interviews, they did mention time taken to run the scan is a impt factor to whether they will use this feature, and actually after looking at the errors we are noticing multiple types of errors, which we might need to further separate. while we havent made a final decision on the type of errors we want, i do want to maintain this pattern for performance reasons, lmk if you have concerns against this

seaerchin commented 8 months ago

hmm wait arh while this is more declarative it is a lot less performant leh, and will scale up to (c*n) time complexity, where c is the number of errors that we acknowledge to exist which is a growing number as we are doing more user interviews! maybe a bit of context here is that in our user interviews, they did mention time taken to run the scan is a impt factor to whether they will use this feature, and actually after looking at the errors we are noticing multiple types of errors, which we might need to further separate. while we havent made a final decision on the type of errors we want, i do want to maintain this pattern for performance reasons, lmk if you have concerns against this

I think splitting the errors up into smaller methods and calling them shouldn't incur extra runtime - the check now is just consolidated into 1 method, which is significantly harder to reason about and extend, and each individual check still takes time.

Under our current model, we do image etc + a href check (2) in 1 pass against all internal links (n).

If we separate out the check so that we call the check by each internal link and not by the list of internal links, it should still be the same run time (was just using an eg previously)

kishore03109 commented 8 months ago

Merge activity

Feb 22, 3:02 AM EST: @kishore03109 started a stack merge that includes this pull request via Graphite.
Feb 22, 3:08 AM EST: Graphite failed to merge this pull request due to internal failures. Try your merge again, or report a bug if you see this consistently.

isomerpages / isomercms-backend

integration with form #1121

Problem

Solution

Before & After Screenshots

Tests

Deploy Notes

Merge activity