Expensify / App

Welcome to New Expensify: a complete re-imagination of financial collaboration, centered around chat. Help us build the next generation of Expensify by sharing feedback and contributing to the code.
https://new.expensify.com
MIT License
3.55k stars 2.89k forks source link

[HOLD for payment 2023-05-05] [$1000] URL validator fails in some cases #17205

Closed kavimuru closed 1 year ago

kavimuru commented 1 year ago

If you haven’t already, check out our contributing guidelines for onboarding and email contributors@expensify.com to request to join our Slack channel!


Action Performed:

  1. Open the Settings menu.
  2. Click on "Workspaces."
  3. Select "Connect Bank Account."
  4. Click on "Company Information (step 2 of 5)."
  5. Click on “Company Website”.
  6. Enter any URL but remove the dot (.) from it.
  7. Notice no error is displayed on wrong URL even though the correct format for the URL (www.expensify.com) is provided below the field.

Expected Result

An error should be displayed for an invalid URL.

Actual Result

No error message is displayed for an invalid URL, regardless of the format provided below the “Company Website” field.

Workaround:

unknown

Platforms:

Which of our officially supported platforms is this issue occurring on?

Version Number: 1.2.97-2 Reproducible in staging?: y Reproducible in production?: y If this was caught during regression testing, add the test name, ID and link from TestRail: Email or phone of affected tester (no customers): Logs: https://stackoverflow.com/c/expensify/questions/4856 Notes/Photos/Videos: Any additional supporting documentation

delete account

https://user-images.githubusercontent.com/43996225/230752518-7e8a2a03-7907-46c0-a1f2-1e80b3ab57a8.mp4

Expensify/Expensify Issue URL: Issue reported by: @ashimsharma10 Slack conversation: https://expensify.slack.com/archives/C049HHMV9SM/p1680991312415289

View all open jobs on GitHub

Upwork Automation - Do Not Edit
  • Upwork Job URL: https://www.upwork.com/jobs/~01c611185cf24bd850
  • Upwork Job ID: 1645780235819393024
  • Last Price Increase: 2023-04-11
MelvinBot commented 1 year ago

Triggered auto assignment to @puneetlath (Bug), see https://stackoverflow.com/c/expensify/questions/14418 for more details.

MelvinBot commented 1 year ago

Bug0 Triage Checklist (Main S/O)

MelvinBot commented 1 year ago

Job added to Upwork: https://www.upwork.com/jobs/~01c611185cf24bd850

MelvinBot commented 1 year ago

Current assignee @puneetlath is eligible for the External assigner, not assigning anyone new.

MelvinBot commented 1 year ago

Triggered auto assignment to Contributor-plus team member for initial proposal review - @thesahindia (External)

MelvinBot commented 1 year ago

Current assignee @puneetlath is eligible for the External assigner, not assigning anyone new.

puneetlath commented 1 year ago

Making external.

allroundexperts commented 1 year ago

Proposal

Please re-state the problem that we are trying to solve in this issue.

Website field validation does not work in the connect bank account page.

What is the root cause of that problem?

The root cause is that we're using an incorrect regular expression to match the URL. This can be seen here. The Regex defined here considers any URL to be valid if it contains a dot in it. It does not validate the TLD of the domain.

What changes do you think we should make in order to solve the problem?

We need to use the URL regex that is defined in the expensify-common library. That regex is much more comprehensive and does the TLD validation as well. As such, we need to import the following here:

import {URL_WEBSITE_REGEX} from  'expensify-common/lib/Url';

Then in the isValidWebsite function, we need to replace the currently existing line with:

    return new RegExp(`^${URL_WEBSITE_REGEX}$`, 'i').test(url);

We should also remove the regex defined here as its not needed.

Note: This will validate the URLs which have any path added to them such as https://google.com/abcd. I think this should be fine but if this is an issue, then we can add another Regex here excluding the path fragment.

What alternative solutions did you explore? (Optional)

None

nhminhduc commented 1 year ago

Proposal

Please re-state the problem that we are trying to solve in this issue.

URL validator fails in some cases

What is the root cause of that problem?

Root cause of this problem is that the WEBSITE regex constant does not cover all cases.

Screenshot 2023-04-11 at 19 52 04 Screenshot 2023-04-11 at 19 52 36 Screenshot 2023-04-11 at 19 52 42

This regex will take www as a part and require only 1 section after www. Using URL_WEBSITE_REGEX from expensify-common library will accept https://www.eu, etc...

Screenshot 2023-04-11 at 21 07 58

What changes do you think we should make in order to solve the problem?

Add negative regex that will only match "https://www.abc". Only value that match regex and does not pass negative regex will be validated. ^(https?|ftp):\/\/(www\.[a-zA-Z0-9]+$) Combining both expensify-common library and negative regex will solve this problem.

What alternative solutions did you explore? (Optional)

  1. Use validatorjs package
  2. Use regex that cover this case: https://stackoverflow.com/questions/3809401/what-is-a-good-regular-expression-to-match-a-url (2nd answer) However the it's to heavy.
MelvinBot commented 1 year ago

📣 @nhminhduc! 📣

Hey, it seems we don’t have your contributor details yet! You'll only have to do this once, and this is how we'll hire you on Upwork. Please follow these steps:

  1. Get the email address used to login to your Expensify account. If you don't already have an Expensify account, create one here. If you have multiple accounts (e.g. one for testing), please use your main account email.
  2. Get the link to your Upwork profile. It's necessary because we only pay via Upwork. You can access it by logging in, and then clicking on your name. It'll look like this. If you don't already have an account, sign up for one here.
  3. Copy the format below and paste it in a comment on this issue. Replace the placeholder text with your actual details.

Screen Shot 2022-11-16 at 4 42 54 PM

Format:

Contributor details
Your Expensify account email: <REPLACE EMAIL HERE>
Upwork Profile Link: <REPLACE LINK HERE>
nhminhduc commented 1 year ago

Contributor details Your Expensify account email: nguyenhoangminhduc@gmail.com Upwork Profile Link: https://www.upwork.com/freelancers/~01b1f6653607fd5e30

MelvinBot commented 1 year ago

✅ Contributor details stored successfully. Thank you for contributing to Expensify!

thesahindia commented 1 year ago

@allroundexperts, I tried your proposal. I used https://www.google and didn't get any error.

@nhminhduc, can't we just improve the regex?

nhminhduc commented 1 year ago

@thesahindia Yes, as in the alternative stackoverflow, there is a regex that can solve this. After a bit of modification ^((https?|ftp):\/\/)?(?:www\.|(?!www\.))(([a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]|[a-zA-Z0-9]+)\.)+(([a-zA-Z0-9][a-zA-Z0-9-]*[a-zA-Z0-9]|[a-zA-Z0-9]){2,})\/?$ The above will not let https://www.com and similar to pass through. However this doesn't check for TLD and there can be unlimited . (current regex also has the same problem).

Screenshot 2023-04-12 at 14 07 51

Also this will not let url with part after / goes through (not sure if this is good or bad thing) ftp://example.com/thiswillnotpassthrough This can be fixed according to your wish by removing \/? at the end of regex.

Website to test regex against list of url: https://rubular.com/r/bfclNFu3lzDGhA

allroundexperts commented 1 year ago

@thesahindia https://www.google is a valid URL because google is a TLD as well. Please check this link for a complete list of TLDs. You can check the TLDs that we've defined in our library here.

Prince-Mendiratta commented 1 year ago

Proposal

Please re-state the problem that we are trying to solve in this issue.

The URL validator fails for certain website URLs.

What is the root cause of that problem?

It is due to insufficient regex validation, as pointed out by @allroundexperts.

What changes do you think we should make in order to solve the problem?

We need to enforce a better regex validation standard. https://github.com/Expensify/App/blob/e8bc6a86f13fd2f5561de5e6251abffa8f890384/src/libs/ValidationUtils.js#L36-L42.

^(https?|ftp):\/\/([a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,}(:[0-9]{1,5})?(\/[\w.-]+)*(\/)?(\?[^\s]*)?(#[^\s]*)?$

This will cover all the follwoing cases:

I believe it is required to let the user enter any URL as you never know what kind of company website the user has and if they got any hacks/workarounds that redirect users and/or require extra identifiers.

ashimsharma10 commented 1 year ago

Dear Contributors Entering a URL with a domain name that is longer than 63 characters should also be checked, I guess.

For example, "https://www.thisisaverylongdomainnamethatexceedsthelimitof63characters.com/" instead of "https://www.example.com/".

nhminhduc commented 1 year ago

@ashimsharma10 made a new regex to limit the domain name to maximum 63 characters: ^((https?|ftp):\/\/)?(?:www\.|(?!www\.))(([a-zA-Z0-9][a-zA-Z0-9-]{0,61}[a-zA-Z0-9]|[a-zA-Z0-9])\.){1,}(?=[a-zA-Z0-9]{2,63}\.?)([a-zA-Z0-9][a-zA-Z0-9-]{0,61}[a-zA-Z0-9]|[a-zA-Z0-9])\/?$ https://rubular.com/r/ubhfAI5qeCAa9I

thesahindia commented 1 year ago

Yes google is a valid TLD but www.{TLD} is not a valid FQDN. I think we should check for valid FQDN. @Puneet what are your thoughts?

I'm uncertain whether we need to validate that each label's length does not exceed 63 characters. In my opinion, I don't think it's a practical scenario to consider. What do you think about this?

allroundexperts commented 1 year ago

Yes google is a valid TLD but www.{TLD} is not a valid FQDN. I think we should check for valid FQDN. @puneet what are your thoughts?

I'm uncertain whether we need to validate that each label's length does not exceed 63 characters. In my opinion, I don't think it's a practical scenario to consider. What do you think about this?

@thesahindia We are using the above everywhere else on our website. We can modify it to not include www. https://blog.google/ is a valid URL btw.

puneetlath commented 1 year ago

Hm, interesting. I like the proposal to use the regex in expensify-common. Perhaps we can improve what is in expensify-common so that it doesn't allow www.{TLD}.

allroundexperts commented 1 year ago

Hm, interesting. I like the proposal to use the regex in expensify-common. Perhaps we can improve what is in expensify-common so that it doesn't allow www.{TLD}.

@puneetlath Yes. I agree with this.

allroundexperts commented 1 year ago

@puneetlath @thesahindia I am thinking more about this. I think we should not edit the regex in the expensify-common repository. The reason being that there is no restriction on the Subdomain or Domain Name regarding the alphabet patterns. I think www is a valid domain name for a given tld.

Screenshot 2023-04-13 at 11 39 16 PM
puneetlath commented 1 year ago

Hm, yes seems you're right:

image

In that case, I think I like @allroundexperts original proposal. Do you see any issues @thesahindia?

thesahindia commented 1 year ago

Do you see any issues @thesahindia?

I didn't get any error when tried using special chars, so we will need to improve it.

allroundexperts commented 1 year ago

Do you see any issues @thesahindia?

I didn't get any error when tried using special chars, so we will need to improve it.

@thesahindia What string did you use?

thesahindia commented 1 year ago

@thesahindia What string did you use?

https://www.bb,

allroundexperts commented 1 year ago

@thesahindia What string did you use?

https://www.bb,

@thesahindia I've updated my proposal. I was missing ^ and $.

allroundexperts commented 1 year ago

16804 proposes to add awesome-phonenumber lib, does that cover this ticket?

Not really. This ticket is about url validation.

huzaifa-99 commented 1 year ago

Sry @allroundexperts, I wanted to comment on a different ticket 😅

thesahindia commented 1 year ago

C/P @ashimsharma10's comment

Dear Contributors Entering a URL with a domain name that is longer than 63 characters should also be checked, I guess.

For example, "https://www.thisisaverylongdomainnamethatexceedsthelimitof63characters.com/" instead of "https://www.example.com/".

@puneetlath, before moving forward can you please confirm if we wanna consider this case? In my opinion, I don't think it's a practical scenario to consider.

puneetlath commented 1 year ago

Yes, I agree. I think we're okay not worrying about that.

MelvinBot commented 1 year ago

📣 @allroundexperts You have been assigned to this job by @puneetlath! Please apply to this job in Upwork and leave a comment on the Github issue letting us know when we can expect a PR to be ready for review 🧑‍💻 Keep in mind: Code of Conduct | Contributing 📖

huzaifa-99 commented 1 year ago

F

MelvinBot commented 1 year ago

@puneetlath, @allroundexperts, @thesahindia Uh oh! This issue is overdue by 2 days. Don't forget to update your issues!

puneetlath commented 1 year ago

PRs are in active review.

MelvinBot commented 1 year ago

Reviewing label has been removed, please complete the "BugZero Checklist".

MelvinBot commented 1 year ago

The solution for this issue has been :rocket: deployed to production :rocket: in version 1.3.7-3 and is now subject to a 7-day regression period :calendar:. Here is the list of pull requests that resolve this issue:

If no regressions arise, payment will be issued on 2023-05-05. :confetti_ball:

After the hold period is over and BZ checklist items are completed, please complete any of the applicable payments for this issue, and check them off once done.

As a reminder, here are the bonuses/penalties that should be applied for any External issue:

MelvinBot commented 1 year ago

BugZero Checklist: The PR fixing this issue has been merged! The following checklist (instructions) will need to be completed before the issue can be closed:

puneetlath commented 1 year ago

@thesahindia @allroundexperts sent you offers.

@ashimsharma10 can you apply to the Upwork job? https://www.upwork.com/jobs/~01c611185cf24bd850

@thesahindia also a friendly reminder about the checklist!

thesahindia commented 1 year ago
  • [ ] [@thesahindia] The PR that introduced the bug has been identified. Link to the PR:

Not sure if we should call this a regression. l would prefer just checking off the first three items since this was a minor improvement.

Regression test proposal

  1. Open the Settings menu.
  2. Click on "Workspaces."
  3. Select "Connect Bank Account."
  4. Fill the info and proceed
  5. At company information steps use click on "Company Website” field.
  6. Use http://www.googlecom and click outside
  7. Verify you see an error
puneetlath commented 1 year ago

Not sure if we should call this a regression. l would prefer just checking off the first three items since this was a minor improvement.

I can see that. Works for me.

puneetlath commented 1 year ago

Checklist is done. Should be good to pay everyone on 5/5.

@ashimsharma10 bump on applying to the Upwork job? https://www.upwork.com/jobs/~01c611185cf24bd850

ashimsharma10 commented 1 year ago

@puneetlath Can you please send me an offer directly? I'm out of Upwork connects. Here is my Upwork : https://www.upwork.com/freelancers/~018a92cf13e1e88eed

puneetlath commented 1 year ago

All paid. Thanks everyone!