quinthar commented 4 months ago

Reported by @zsgreenwald here: https://expensify.slack.com/archives/C066HJM2CAZ/p1715280747390969

Upwork Automation - Do Not Edit

Upwork Job URL: https://www.upwork.com/jobs/~019eb5f4d74fc52968
Upwork Job ID: 1788937751790014464
Last Price Increase: 2024-06-27
Automatic offers:

DylanDylann | Reviewer | 0

Issue Owner

Current Issue Owner: @alexpensify

quinthar commented 4 months ago

Optimistically assigning to @tgolen

tgolen commented 4 months ago

It looks like this is because we don't support <video> tags in our HTML to markdown conversion.

@kidroca Do you agree that we just need to add support for <video> tags, like what you did with the <img /> ones?

kidroca commented 4 months ago

It looks like this is because we don't support <video> tags in our HTML to markdown conversion.

@kidroca Do you agree that we just need to add support for <video> tags, like what you did with the <img /> ones?

Yes, this seems to be the problem

We have to convert <video> to ![vid](src.mp4) and back to <video> in order to preserve the correct behaviour

melvin-bot[bot] commented 4 months ago

Triggered auto assignment to @laurenreidexpensify (Bug), see https://stackoverflow.com/c/expensify/questions/14418 for more details. Please add this bug to a GH project, as outlined in the SO.

melvin-bot[bot] commented 4 months ago

Job added to Upwork: https://www.upwork.com/jobs/~019eb5f4d74fc52968

melvin-bot[bot] commented 4 months ago

Triggered auto assignment to Contributor-plus team member for initial proposal review - @DylanDylann (External)

tgolen commented 4 months ago

Great, thanks! Let's get this one worked on by a contributor.

dominictb commented 4 months ago

Proposal

Please re-state the problem that we are trying to solve in this issue.

Editing a comment with a video removes the video

What is the root cause of that problem?

We don't support <video> tags in our HTML to markdown conversion, and conversion back from markdown to <video>

What changes do you think we should make in order to solve the problem?

Convert <video> to ![vid](src.mp4), we can add a rule in expensify-common to do this, similar to images. Some differences to note:
- The video will use data-expensify-source, not src like image
- The video will have closing tags, unlike image which doesn't

So the Regex will need to accommodate those differences

Pseudo-code:

{
    name: 'video',
    regex: /<video[^><]*data-expensify-source\s*=\s*(['"])(\S*?)\1(.*?)>([^><]*)<\/video>*(?![^<][\s\S]*?(<\/pre>|<\/code>))/gi,
    replacement: (match, g1, g2, g3, g4) => {
        if (g4) {
            return `![${g4}](${g2})`;
        }

        return `!(${g2})`;
    },
},

This works well for current use cases but some details might need to be changed based on expectations

Convert video markdown back to <video>, we can add a rule in expensify-common to do this, similar to images. The difference with image is that the src of the video will have a video file extension (like .mp4)

Or we can just modify the rule here to work for both images and videos, based on source file extension

{
    name: 'imageOrVideo',
    regex: MARKDOWN_IMAGE_REGEX,
    replacement: (match, g1, g2) => {
        if (g2.endsWith('.mp4')) { // we should handle other video file extensions too
            return `<video src="${Str.sanitizeURL(g2)}">${g1 ? `${g1}` : ''}<\/video>`
        }
        return `<img src="${Str.sanitizeURL(g2)}"${g1 ? ` alt="${this.escapeAttributeContent(g1)}"` : ''} />`
    },
    // will need to update this similarly
    rawInputReplacement: (match, g1, g2) => `<img src="${Str.sanitizeURL(g2)}"${g1 ? ` alt="${this.escapeAttributeContent(g1)}"` : ''} data-raw-href="${g2}" data-link-variant="${typeof (g1) === 'string' ? 'labeled' : 'auto'}" />`
},

We should take additional references from this PR too to make it work for videos like for images.

What alternative solutions did you explore? (Optional)

NA

Video

It will work fine like this after the fix https://github.com/Expensify/App/assets/165644294/365b3911-4854-4fa8-830c-ca65c9846119

dominictb commented 4 months ago

@tgolen @kidroca To confirm:

We have some details in the <video> like data-expensify-width, data-expensify-height, data-expensify-thumbnail-url Do they need to be included in the markdown? If yes, what will the markdown look like with those metadata?
I assume the vid referred to here is the file name right?
```
![vid](src.mp4)
```
Because the format of the html is <video ...>fileName</video>. Or is the [vid] a hardcoding thing to distinguish the video markdown from the image markdown?

(FWIW it's ok for the markdown formats of video and image to be the same, we can distinguish them by the source file extension)

dominictb commented 4 months ago

Proposal updated to include more details

DylanDylann commented 4 months ago

In my opinion, In reality there are not many cases in which users want to update a video/image by updating the video/image link. When users send a video/image with a message, we should only allow the user to edit the message, not the video/image link. It means that when editing the message that is attached to a video/image, we only display the message in the edit composer and exclude the video/image link

tgolen commented 4 months ago

@dominictb I think it's OK to use the same markdown for both images and videos and base all logic off of file extensions.

We have some details in the

I think we do need to keep this, yes. Is there anyway to keep this without including it in the markdown syntax? They are fields that shouldn't be edited by the user.

@DylanDylann I think there are quite a few cases where users need to replace or delete a video/image entirely, so I don't think stripping out video/img tags from the editor is an option at this point.

DylanDylann commented 4 months ago

If that, @dominictb's proposal looks promising. Let's me deep dive into it

DylanDylann commented 4 months ago

@dominictb Same question

Is there anyway to keep this without including it in the markdown syntax?

dominictb commented 4 months ago

I think we do need to keep this, yes. Is there anyway to keep this without including it in the markdown syntax?

@tgolen @DylanDylann We can store the metadata in a local cache or in Onyx, it will be like an object with the video source as the key, similar to how we cached the thumbnail dimensions

We can initialize the cache when the user starts editing the message, and clear the cache after the user finishes editing

tgolen commented 4 months ago

That sounds great, I like it.

DylanDylann commented 4 months ago

Look good to me. Let's go with their proposal

🎀 👀 🎀 C+ reviewed

melvin-bot[bot] commented 4 months ago

Current assignee @tgolen is eligible for the choreEngineerContributorManagement assigner, not assigning anyone new.

melvin-bot[bot] commented 4 months ago

📣 @DylanDylann 🎉 An offer has been automatically sent to your Upwork account for the Reviewer role 🎉 Thanks for contributing to the Expensify app!

Offer link Upwork job

melvin-bot[bot] commented 4 months ago

📣 @dominictb You have been assigned to this job! Please apply to the Upwork job and leave a comment on the Github issue letting us know when we can expect a PR to be ready for review 🧑‍💻 Once you apply to this job, your Upwork ID will be stored and you will be automatically hired for future jobs! Keep in mind: Code of Conduct | Contributing 📖

kidroca commented 4 months ago

I think we do need to keep this, yes. Is there any way to keep this without including it in the markdown syntax?

@tgolen @DylanDylann We can store the metadata in a local cache or in Onyx. It will be like an object with the video source as the key, similar to how we cached the thumbnail dimensions.

We can initialize the cache when the user starts editing the message and clear the cache after the user finishes editing.

How would this work? Currently, the MD to HTML and vice versa conversion is handled solely by Expensimark and the provided markup. How would you extract data-expensify-width, data-expensify-height, and data-expensify-thumbnail-url, store them in cache, and then reinsert them when the MD is converted back to HTML, given that Expensimark is unaware of the context and parses only the markup it sees?

What would happen if the user changes the source while editing the comment? The width, height, and thumbnail attributes would no longer be relevant to the new source. These attributes are generated by the backend, so wouldn't it be better to let the backend regenerate them when absent instead of trying to cache them?

I like @DylanDylann's idea:

In reality, there are not many cases where users want to update a video/image by updating the link. When users send a video/image with a message, we should only allow them to edit the message, not the video/image link. This means that when editing a message with an attached video/image, we only display the message in the edit composer and exclude the video/image link.

The app can allow users to remove and attach a new file instead of freely tweaking the source and causing problems. Nothing good can come out by manually editing a source like !(https://www.expensify.com/chat-attachments/5313484134548058856/w_a7534b3770eec5f57f4cbc51e595ffc53d837958.webp); either the whole thing should be removed or none of it.

We have a similar problem with images, but it seems unnoticed.

https://github.com/Expensify/App/assets/12156624/58ed8062-5520-41f2-a971-dc291af9956f

Steps to recreate:

Type text and add an image attachment.
After sending the comment, inspect the <img> tag for data-expensify attributes.
Edit the comment and make a trivial change to the text before the attachment.
Observe that the <img> tag no longer has data-expensify attributes.
Disable cache in dev tools, or clear cache and reload the page.
Observe that the edited comment no longer loads the image.

Bug: When the cache expires, the image can no longer be loaded.

Because data-expensify-source attribute is no longer on the <img> tag the required authentication header is not applied on the image request

Expected: Editing a comment, clearing cache, and reloading the page should display the comment and the image attachment.

Actual: Clearing cache and reloading shows the comment and a placeholder box instead of the image attachment.

tgolen commented 4 months ago

These attributes are generated by the backend, so wouldn't it be better to let the backend regenerate them when absent instead of trying to cache them?

Ah, this is a very good point. You're right that these probably should come with the backend.

Let's step back for a second to make sure we're thinking about the requirements fully. Assuming you have a text+attachment message, there could be the following scenarios:

I only want to edit the text, leave the attachment alone
I only want to replace the attachment, leave the text alone
I want to edit the text, and replace the attachment

Do we all agree those are the cases that should ideally be covered?

DylanDylann commented 4 months ago

Even though when uploading an attachment independently (without message) we also don't have any option to edit the attachment

DylanDylann commented 4 months ago

@tgolen From my point of view, I prefer

I only want to edit the text, leave the attachment alone

And I think it also be @kidroca's thought

The app can allow users to remove and attach a new file instead of freely tweaking the source and causing problems. Nothing good can come out by manually editing a source like !(https://www.expensify.com/chat-attachments/5313484134548058856/w_a7534b3770eec5f57f4cbc51e595ffc53d837958.webp); either the whole thing should be removed or none of it.

tgolen commented 4 months ago

I want to be careful here that technical difficulty isn't what is standing in the way of the user experience. In GH for example, you can freely edit the markdown after uploading a file, and it seems to be totally fine. Yes, you can muck with the source and screw it up if you want to, but that's what editing markdown is all about. We don't need to protect users against themselves.

DylanDylann commented 4 months ago

@dominictb Let's go on the PR phase

melvin-bot[bot] commented 4 months ago

📣 @lantrungseo! 📣 Hey, it seems we don’t have your contributor details yet! You'll only have to do this once, and this is how we'll hire you on Upwork. Please follow these steps:

Make sure you've read and understood the contributing guidelines.
Get the email address used to login to your Expensify account. If you don't already have an Expensify account, create one here. If you have multiple accounts (e.g. one for testing), please use your main account email.
Get the link to your Upwork profile. It's necessary because we only pay via Upwork. You can access it by logging in, and then clicking on your name. It'll look like this. If you don't already have an account, sign up for one here.
Copy the format below and paste it in a comment on this issue. Replace the placeholder text with your actual details. Format:
```
Contributor details
Your Expensify account email: <REPLACE EMAIL HERE>
Upwork Profile Link: <REPLACE LINK HERE>
```

dominictb commented 4 months ago

@DylanDylann @tgolen @kidroca I believe this is a BE bug: Trying to edit the comment with video, saw that the BE returned a different format.

laurenreidexpensify commented 4 months ago

PR in draft, not overdue

dominictb commented 4 months ago

@DylanDylann @tgolen @laurenreidexpensify pls help me check this issue https://github.com/Expensify/App/issues/41952#issuecomment-2124268064, then I'll get my PR ready for review. Currently I couldn't record the video evidence because of this.

melvin-bot[bot] commented 4 months ago

@tgolen @laurenreidexpensify @DylanDylann @dominictb this issue was created 2 weeks ago. Are we close to approving a proposal? If not, what's blocking us from getting this issue assigned? Don't hesitate to create a thread in #expensify-open-source to align faster in real time. Thanks!

dominictb commented 4 months ago

@tgolen @laurenreidexpensify @DylanDylann bump again for https://github.com/Expensify/App/issues/41952#issuecomment-2124268064

melvin-bot[bot] commented 4 months ago

@tgolen, @laurenreidexpensify, @DylanDylann, @dominictb Huh... This is 4 days overdue. Who can take care of this?

laurenreidexpensify commented 4 months ago

@DylanDylann can you please check PR to help @dominictb thanks

DylanDylann commented 4 months ago

@laurenreidexpensify Thanks for your reminder

DylanDylann commented 4 months ago

@tgolen Could you take a look at @dominictb's comment? It appears that the UpdateComment API doesn't support video tags in its parameters. From my viewpoint, we have two options to address this:

Modify the backend to allow the UpdateComment API to accept video tags in the parameters
Convert the video tag to an anchor tag and pass it as a parameter to the UpdateComment API

laurenreidexpensify commented 4 months ago

Just a heads up @tgolen is currently offline, he'll be back on Monday to address this

DylanDylann commented 4 months ago

Not overdue, waiting for @tgolen on the above comment

DylanDylann commented 4 months ago

Whoops, my comment can't remove the overdue label because I don't have a yellow star 😄

laurenreidexpensify commented 4 months ago

@tgolen bump for a response when you online later thanks

melvin-bot[bot] commented 3 months ago

@tgolen, @laurenreidexpensify, @DylanDylann, @dominictb Huh... This is 4 days overdue. Who can take care of this?

tgolen commented 3 months ago

Hey all, thanks for being patient while I was OOO last week. I'll look into the problem with UpdateComment API command. It looks like it's stripping out the tags, which I didn't know about.

tgolen commented 3 months ago

I've opened an internal issue https://github.com/Expensify/Expensify/issues/402097 to track this and I am looking into it. I'm currently running into some roadblocks because we use https://htmlpurifier.org/ which has terrible documentation and doesn't natively support <video>. There is an updated version https://github.com/xemlock/htmlpurifier-html5 which I tried, but it leads to other cryptic error messages that I have to investigate HTMLPurifier_Exception: Inconsistent use of optimized and unoptimized raw definition retrievals (this backtrace is for the first inconsistent call, which was for a unoptimized raw definition).

tgolen commented 3 months ago

Daily Update

I got the errors figured out and have a PR with support for <video> tags

Next Steps

Get the PR deployed

ETA

Deployed to staging by Friday, June 7

tgolen commented 3 months ago

Daily Update

@dominictb The PR to stop stripping the <video> tags has been deployed to staging

Next Steps

@dominictb finish the front-end work here

dominictb commented 3 months ago

@tgolen I will wrap up the draft PRs and get them ready for review asap.

kidroca commented 3 months ago

Regarding the selected proposal,

I believe the data-expensify-source attribute is intended only for content hosted through Expensify—specifically, files that users upload via NewDot. External images (and possibly videos) posted using Markdown syntax like ![img](...) don't have this attribute.

The data-expensify-source attribute is used to determine whether the request should include an authentication header. This header should not be added to external resources, meaning the ExpensiMark parser shouldn't apply it indiscriminately in HTML.

Alternatively, authentication could be determined based on the address itself, rather than the presence of the attribute.

tgolen commented 3 months ago

Weekly Update

@tgolen and @dominictb are still going through a few final review comments in the Expensify-Common PR

Next Steps

@tgolen finish reviewing and merge https://github.com/Expensify/expensify-common/pull/703
@dominictb create a PR for App that updates the expensify-common version

ETA

All PRs merged by Wednesday, June 19

melvin-bot[bot] commented 3 months ago

Triggered auto assignment to @alexpensify (Bug), see https://stackoverflow.com/c/expensify/questions/14418 for more details. Please add this bug to a GH project, as outlined in the SO.

laurenreidexpensify commented 3 months ago

@alexpensify adding you here for parental leave sub. no action required this BZ this week, PR is still in review, but adding you for payment in coming weeks when the fix makes it to prod. Thanks

Expensify / App

[HOLD for payment 2024-06-28] [$500] HIGH: [Polish] Editing a comment with a video removes the video #41952

Proposal

Please re-state the problem that we are trying to solve in this issue.

What is the root cause of that problem?

What changes do you think we should make in order to solve the problem?

What alternative solutions did you explore? (Optional)

Video

Daily Update

Next Steps

ETA

Daily Update

Next Steps

Weekly Update

Next Steps

ETA