joke2k / faker

Faker is a Python package that generates fake data for you.
https://faker.readthedocs.io
MIT License
17.79k stars 1.93k forks source link

URNs produce the wrong result for .uri() method #2102

Open tomschr opened 2 months ago

tomschr commented 2 months ago

Thanks for this great project! :heart:

If you run the .uri() method and pass the URN (Uniform Resource Name) scheme, it returns the wrong result.

Steps to reproduce

Consider the following snippet:

>>> from faker import Faker
>>> fake = Faker()
>>> fake.uri(schemes=["urn"])
'urn://thompson.org/tagsauthor.htm'

Expected behavior

URNs are described in RFC 1737 and later in RFC 2141, I would have expected to have something like this:

urn:isbn:0451450523
urn:mpeg:mpeg7:schema:2001
urn:mrn:iala:pub:g1143

Find more examples in the Wikipedia article about Uniform Resource Name.

Actual behavior

At the moment, the URN scheme uses slashes (/) which is wrong according to the mentioned specs. As the method name .uri() implies, it's the generic term for URNs and URLs.

Possible solution

I can think about different solutions to solve this issue:

  1. Fix the .uri() method and return the correct result when a URN scheme is requested.
  2. Introduce a new .urn() method.

Idea 1 would probably the more generic solution as it works for URLs and URNs. On the other hand, idea 2 would make URNs stick out.

Abdujabbar commented 1 month ago

@tomschr I did a little research and found this list of URI schemas: https://en.wikipedia.org/wiki/List_of_URI_schemes

Based on this list, schemas=['urn'] is not the right way because it's not in the list of URI schemas.

I think it would be better to have a new method .urn().

It's just opinion.

tomschr commented 1 month ago

@tomschr I did a little research and found this list of URI schemas: https://en.wikipedia.org/wiki/List_of_URI_schemes

Based on this list, schemas=['urn'] is not the right way because it's not in the list of URI schemas.

Ahh, that's interesting. I didn't know, thanks!

I think it would be better to have a new method .urn().

It's just opinion.

I'm fine with using .urn() as the way to go. 👍

My only concern is that users will find it. So one way to ensure that could be:

Abdujabbar commented 1 month ago

@tomschr

In this case, it would be great to hear @fcurella opinion also.