gratipay / gratipay.com

Here lieth a pioneer in open source sustainability. RIP
https://gratipay.news/the-end-cbfba8f50981
MIT License
1.12k stars 308 forks source link

Broken URL links for npm packages with special characters? #4496

Closed rohitpaulk closed 7 years ago

rohitpaulk commented 7 years ago

From @calvinhp on Slack:

I just found a 404 :slightly_smiling_face: If I search for Plone, the first matching project is @plone/plone-react but it is a 404

Search results:

screen shot 2017-06-01 at 8 00 50 pm

Broken Link: https://gratipay.com/on/npm/@plone/plone-react/

I tried URL encoding it to be https://gratipay.com/on/npm/%40plone%2Fplone-react/, didn't work. I've encountered similar problems with common routing libraries where a / can't be URL-encoded and placed in the URL, although the spec supports it - might be related.

chadwhitacre commented 7 years ago

Blorg. At a minimum I guess we need to support these under /on/npm/. We don't allow forward slash in Gratipay project slugs (for, ahem, obvious reasons). So what are we going to do with projects claimed from these packages?

chadwhitacre commented 7 years ago

We also don't allow @.

chadwhitacre commented 7 years ago

https://www.npmjs.com/package/@plone/plone-react/

It looks to me like this is a namespacing hack in npm. We plan to implement namespacing directly (#4246), at which point the obvious thing to do is undo npm's hack and simply name the project plone-react. We could do the same now against the promise of #4246.

chadwhitacre commented 7 years ago

I think it's safer to consume the remainder of the string under /on/npm/ since we're not expecting to use that URL space for other things.

chadwhitacre commented 7 years ago

Can npm packages have / as the final character in their name? If so we've got a trailing slash race.

chadwhitacre commented 7 years ago

simply name the project plone-react

That's what they're already doing on GitHub anyway: https://github.com/plone/plone-react.

chadwhitacre commented 7 years ago

If so we've got a trailing slash race.

I guess we can do an extra db query if we don't find a match under .../ or vice-versa.

chadwhitacre commented 7 years ago

Though the trailing slash case should be so rare that we should always try the non-trailing db query first I think. Much more likely to send /on/npm/foo/ and want /foo than the other way around.

chadwhitacre commented 7 years ago

Docs: https://docs.npmjs.com/misc/scope. Maybe we don't have to worry about trailing slashes? Is there something interesting to be done with scope endpoints? They don't.

chadwhitacre commented 7 years ago

I think the name itself can't include a /:

The name ends up being part of a URL, an argument on the command line, and a folder name. Therefore, the name can't contain any non-URL-safe characters.

https://docs.npmjs.com/files/package.json

chadwhitacre commented 7 years ago

How are scoped packages stored in our packages table?

chadwhitacre commented 7 years ago

With the scope in the name. 👍

gratipay::DATABASE=> select * from packages where name like '@%' limit 10;
┌────────┬─────────────────┬──────────────────────────────────────────┬──────────────────────────────────
│   id   │ package_manager │                   name                   │                                 d
├────────┼─────────────────┼──────────────────────────────────────────┼──────────────────────────────────
│ 872909 │ npm             │ @zeke/scopey                             │ A package for testing npm scopes 
│ 873868 │ npm             │ @scdhhs/document-services                │ A collection of services that pro
│ 874294 │ npm             │ @lib/tsargs                              │ Adapter for using tsconfig.json f
│ 874527 │ npm             │ @asaayers/react-a11y                     │ Warns about potential accessibili
│ 874540 │ npm             │ @dmail/test                              │                                  
│ 874829 │ npm             │ @smikes/bletch                           │                                  
│ 874996 │ npm             │ @planeshifter/tweet-sentiment            │ SVM Classifier to Detect Sentimen
│ 875207 │ npm             │ @jasonfill/lambda-cloud-watch-log-parser │ Parses event data from cloud watc
│ 875344 │ npm             │ @stevegill/test-module                   │ testing out scoped modules       
│ 875568 │ npm             │ @netapps/netapps-session                 │ netapps-api session store for Exp
└────────┴─────────────────┴──────────────────────────────────────────┴──────────────────────────────────
(10 rows)
chadwhitacre commented 7 years ago

In that case we should add redirect from trailing slash to non-trailing slash just for on/npm/$package.spt. We'll never want a trailing slash as part of a name, but we don't want to 404 if we get one in the URL.

chadwhitacre commented 7 years ago

Moving to a PR in #4565 ...

chadwhitacre commented 7 years ago

Reopening for clean-up. When packages are claimed they get a name, and the name is wrong on some already claimed:

gratipay::DATABASE=> select id, name, slug, slug_lower from teams where slug like '@%';
┌─────┬─────────────────────────────┬─────────────────────────────┬─────────────────────────────┐
│ id  │            name             │            slug             │         slug_lower          │
├─────┼─────────────────────────────┼─────────────────────────────┼─────────────────────────────┤
│ 910 │ @sky-uk/eslint-config-sky   │ @sky-uk/eslint-config-sky   │ @sky-uk/eslint-config-sky   │
│ 808 │ @ciffi-js/cookies           │ @ciffi-js/cookies           │ @ciffi-js/cookies           │
│ 809 │ @ciffi-js/custom-select     │ @ciffi-js/custom-select     │ @ciffi-js/custom-select     │
│ 810 │ @ciffi-js/device            │ @ciffi-js/device            │ @ciffi-js/device            │
│ 811 │ @ciffi-js/responsive-images │ @ciffi-js/responsive-images │ @ciffi-js/responsive-images │
│ 812 │ @ciffi-js/router            │ @ciffi-js/router            │ @ciffi-js/router            │
│ 814 │ @ciffi-js/utils             │ @ciffi-js/utils             │ @ciffi-js/utils             │
└─────┴─────────────────────────────┴─────────────────────────────┴─────────────────────────────┘
(7 rows)

gratipay::DATABASE=>
chadwhitacre commented 7 years ago

slug_lower has a unique constraint on it.

BEGIN;
    UPDATE teams SET slug='eslint-config-sky', slug_lower='eslint-config-sky', name='eslint-config-sky'
     WHERE id=910;
    UPDATE teams SET slug='cookies', slug_lower='cookies', name='cookies'
     WHERE id=808;
    UPDATE teams SET slug='custom-select', slug_lower='custom-select', name='custom-select'
     WHERE id=809;
    UPDATE teams SET slug='device', slug_lower='device', name='device'
     WHERE id=810;
    UPDATE teams SET slug='responsive-images', slug_lower='responsive-images', name='responsive-images'
     WHERE id=811;
    UPDATE teams SET slug='router', slug_lower='router', name='router'
     WHERE id=812;
    UPDATE teams SET slug='utils', slug_lower='utils', name='utils'
     WHERE id=814;
END;
chadwhitacre commented 7 years ago
gratipay::DATABASE=> \i rename.sql 
BEGIN
UPDATE 1
UPDATE 1
UPDATE 1
UPDATE 1
UPDATE 1
UPDATE 1
UPDATE 1
COMMIT
gratipay::DATABASE=>
chadwhitacre commented 7 years ago

https://gratipay.com/eslint-config-sky https://gratipay.com/cookies https://gratipay.com/custom-select https://gratipay.com/device https://gratipay.com/responsive-images https://gratipay.com/router https://gratipay.com/utils

chadwhitacre commented 7 years ago

Actually, let's manually namespace ciffi ahead of #4246. utils is just too general to leave in the top-level namespace.

BEGIN;
    UPDATE teams SET slug='ciffi-cookies'
                   , slug_lower='ciffi-cookies'
                   , name='ciffi-cookies'
     WHERE id=808;
    UPDATE teams SET slug='ciffi-custom-select'
                   , slug_lower='ciffi-custom-select'
                   , name='ciffi-custom-select'
     WHERE id=809;
    UPDATE teams SET slug='ciffi-device'
                   , slug_lower='ciffi-device'
                   , name='ciffi-device'
     WHERE id=810;
    UPDATE teams SET slug='ciffi-responsive-images'
                   , slug_lower='ciffi-responsive-images'
                   , name='ciffi-responsive-images'
     WHERE id=811;
    UPDATE teams SET slug='ciffi-router'
                   , slug_lower='ciffi-router'
                   , name='ciffi-router'
     WHERE id=812;
    UPDATE teams SET slug='ciffi-utils'
                   , slug_lower='ciffi-utils'
                   , name='ciffi-utils'
     WHERE id=814;
END;
gratipay::DATABASE=> \i rename.sql
BEGIN
UPDATE 1
UPDATE 1
UPDATE 1
UPDATE 1
UPDATE 1
UPDATE 1
COMMIT
gratipay::DATABASE=>

https://gratipay.com/ciffi-cookies https://gratipay.com/ciffi-custom-select https://gratipay.com/ciffi-device https://gratipay.com/ciffi-responsive-images https://gratipay.com/ciffi-router https://gratipay.com/ciffi-utils