MozillaFoundation / foundation.mozilla.org

Mozilla Foundation website
https://foundation.mozilla.org
Mozilla Public License 2.0
382 stars 153 forks source link

Better URL validation and URL types #10905

Open danielfmiranda opened 1 year ago

danielfmiranda commented 1 year ago

As we move forward with our implementation of Fundraise Up, we have gotten requests from users to be able to update URLs with relative URLs (Ex: ?form=donate), such as https://github.com/MozillaFoundation/foundation.mozilla.org/pull/10897 and https://github.com/MozillaFoundation/foundation.mozilla.org/issues/10514.

At the moment, this is achieved by using a Charfield with a custom regex validator that we have defined in constants.py however, I anticipating that we will get this request again as time goes on, and think it would be a idea to discuss if this is the best approach or if we can come up with a better one.

┆Issue is synchronized with this Jira Epic

mtdenton commented 1 year ago

@Gbaldassano @danielfmiranda Curious if @jhonatan-lopes has a strong opinion here or whether or not we should talk this out in the ENG sprintly meeting

jhonatan-lopes commented 1 year ago

To be honest, I always have reservations with re-implementing URL validators in-house, because of how easy it is to do it wrong.

I'm just thinking if we have something better in Wagtail to take care of this. Maybe we could use a StreamField with StreamBlocks and controlled options for these blocks, something like:

donate_url = StreamField(
        blocks.StreamBlock(
            [
                (
                    "external_link",
                    blocks.StructBlock(
                        [("url", blocks.URLBlock()), ("title", blocks.CharBlock())],
                        icon="link",
                    ),
                ),
                (
                    "relative_url",
                    blocks.StructBlock(
                        [
                            ("relative_url", blocks.RelativeURLBlock()),
                            ("title", blocks.CharBlock()),
                        ],
                        icon="link",
                    ),
                ),
            ],
            max_num=1,
            required=True,
        ),
        blank=True,
    )

This way, we would need to implement a RelativeURLBlock (and its regex validation) and some methods for getting the link. Since the validation for the relative URL is much simpler, maybe this could be a better alternative. The UX/UI would also be better since it would allow the user to choose between an external link and a relative link.

mtdenton commented 11 months ago

Let's talk about this in sprintly

jhonatan-lopes commented 11 months ago

Found this LinkButtonBlock today in the codebase, which is also using a CharBlock to handle relative urls:

class LinkButtonBlock(blocks.StructBlock):
    label = blocks.CharBlock()

    # We use a char block because UrlBlock does not
    # allow for relative linking.
    URL = blocks.CharBlock()
    ....
simont-cr commented 4 months ago

@jhonatan-lopes Is there any further investigation or discussion that we would need as a team for this particular ticket? I see that the related ticket https://github.com/MozillaFoundation/foundation.mozilla.org/issues/10846 is already closed, so I'm curious if we should be keeping this as well.

jhonatan-lopes commented 4 months ago

Hey @simont-cr, I think we should go ahead with wagtail-link-block, which allows editors to select the type of link that they want. I've implemented this for the LinkBlock and LinkButtonBlock (along with some Mozfest ones) on this PR: https://github.com/MozillaFoundation/foundation.mozilla.org/pull/11954

jhonatan-lopes commented 4 months ago

We need to migrate the following to use some form of the LinkBlock.

Blocks that use links

List of blocks that use a CharBlock or URLBlock for links:

Models that use links:

Mozfest:

Wagtailpages

jhonatan-lopes commented 4 months ago

Other considerations

There are several places where we see CMS links being hardcoded into content. For instance, a blog page wants to link to the PNI homepage, so instead of using a page selector, they put a link to "foundation.mozilla.org/en/privacy-not-included".

Although this could arguably be acceptable for PNI (which is not expected to move ever), it is awful for any other pages. If the page moves, the URL won't be updated. Worse, a redirection loop can be introduced. We also lose localisation because a lot of times the "en" version is hardcoded. Finally, syncs between local/prod/staging do not work and always point to production because the links are hardcoded.

We should investigate if it's possible to migrate those links are we are doing this correction work and put better guards in place to make sure that editors can not put external/relative links that contain the domain for the current site.

marcwalsh commented 4 months ago

We should investigate if it's possible to migrate those links are we are doing this correction work and put better guards in place to make sure that editors can not put external/relative links that contain the domain for the current site.

A couple of pieces I just wanted to flag around our current use cases is that linking to anchor links on specific pages means we use over page picker.

Also, adding Tito variables either by themselves in the URL boxes without validation allows us open the Tito window on elements that do not have a Tito Ticket picker on.

Example of Tito link: ?tito=%2FMozilla%2Fmozfest-house-amsterdam-2024%2Fen%2Fregistrations%2Fnew%3Freleases%3Dcommunity-pass%26prefill%3D%257B%257D

Or when linking internally, eg when the URL is validated, we would link to this as an example: https://www.mozillafestival.org/?tito=%2FMozilla%2Fmozfest-house-amsterdam-2024%2Fen%2Fregistrations%2Fnew%3Freleases%3Dcommunity-pass%26prefill%3D%257B%257D

danielfmiranda commented 1 month ago

Hi Everyone!

I have been working on TP1-60 and have noticed an issue that we should address if we want to use the LinkBlock for link fields across the codebase, as proposed in the ticket.

The current LinkBlock class, introduced with the Main Nav work, allows users to create links of various types (internal, external, file, phone, email) with built-in validation. While this could be useful for our existing URL fields, which at the moment are just CharFields, it has a required label field inherited from its base class, BaseLinkBlock.

This makes sense in the navigation context, where every URL would need a label to accompany it. However, this required field complicates its use in other parts of the codebase.

For example, the AnnotatedImageBlock currently has a caption field and a captionURL field, both optional, which are used to render the image's caption as a link to another page. The caption already serves as a label for the image, and the captionURL simply provides a link: Screenshot 2024-06-05 at 16-15-16 Mozilla Foundation - Donate Help

However, updating the captionURL field to use the LinkBlock introduces an additional required label field. This required label field is redundant in this context because the caption already functions as the label. Adding this extra field makes the CMS structure a little confusing:

Screenshot 2024-06-05 at 15-27-47 Editing Donate help page Donate Help - Wagtail

I anticipate the same redundancy issue with the following other stream blocks/fields in the codebase:

CardGridBlocklink_url and link_label ImageTextBlockurl and text ImageGridBlockurl iFrameBlockurl and captionURL ListingCard / ListingBlocklink_url and link_page GroupListingCardurl ImageTeaserBlockurl_label and url TextOnlyTeaserlink_page and link_url

To solve this, I propose decoupling the label field from the BaseLinkBlock class, and creating a new NavLinkBlock class that includes the label field, specifically for navigation items.

Once this is done, updating existing URL fields will be straightforward: we can simply update the link field to LinkBlock and migrate the old data over, rather than finding ways to work with the label field in existing models. This would make completing the epic a little bit easier on the devs.

I have created https://mozilla-hub.atlassian.net/browse/TP1-775 to capture this work, and will start working on it now!

jhonatan-lopes commented 1 month ago

Hey @danielfmiranda, I was aware of this issue when I wrote up the BaseLinkBlock. The idea was to make sure that devs were intentional in where they set the label so that we don't run into the same issues again and to have a unified approach towards links, e.g. they should link somewhere and have a label -- that's the basic functionality.

In places where label is replaced by something else (like the caption), I would rather remove the label in that place only than try to rewrite the whole BaseLinkBlock for this. I think this will result in the same problems down the line: some links will have labels, others won't and that's an issue.

For blocks that currently have a label but with a different name, I would do a data migration to convert the old field into a label now. There are examples on how to do this on the "example implementation" on the nav work. For CardGrid, for instance, the migration would be to migrate the link that is on link_url to external_url and link_label to the label field, and then delete link_label and link_url.

For blocks where we don't want to keep the label (thinking quite carefully about this), we can remove the label from the block in the init method of the derived class. Something like:

class MyBlockWithoutLabel(BaseLinkBlock):
    def __init__(self, local_blocks=None, **kwargs):
        super().__init__(local_blocks, **kwargs)
        self.child_blocks = self.base_blocks.copy()
        self.child_blocks.pop("label") # this will remove "label" from the child blocks
danielfmiranda commented 1 month ago

Thanks @jhonatan-lopes!

This simplifies things greatly 🙌 and would remove the need of creating any more classes, besides the "BlockWithoutLabel" one.

danielfmiranda commented 1 month ago

Hi @jhonatan-lopes! I have implemented your suggestion and its working great in terms of removing the label field when its not needed.

However, one thing I am noticing is that the "link_to" required choice select box is preventing the page from being saved in situations where the URL field should be optional. Screenshot 2024-06-06 at 16-28-21 Editing Donate help page Donate Help - Wagtail

I am looking into solutions for this now, but was wondering if you may have any suggestions that come to mind?

Thanks!

jhonatan-lopes commented 1 month ago

You could wrap it up in a ListBlock:

class MyBlockWithOptionalLink(StructBlock):
    link = ListBlock(LinkBlock, min_num=0, max_num=1)

where LinkBlock is the one that you wrote without a label (it has to be a concrete implementation of BaseLinkBlock).

Another thing I'm seeing there is that the JS is not hiding the link fields dynamically. To do that, you need to register concrete implementations of BaseLinkBlock with its adapter. See https://github.com/MozillaFoundation/foundation.mozilla.org/blob/67bfafa42eff1051da722d96640e8546c50a5755/network-api/networkapi/nav/blocks.py#L50 for an example

danielfmiranda commented 1 month ago

Thanks @jhonatan-lopes! 👍