fedeya / remix-sitemap

Sitemap generator for Remix applications
https://npmjs.com/remix-sitemap
MIT License
99 stars 6 forks source link

Static sitemap generation losing URL before a slug #43

Closed AjaniBilby closed 1 year ago

AjaniBilby commented 1 year ago

Describe the bug When I run remix-sitemap it does correctly identify all of the routes:

🔍 Found routes: root, routes/admin.transaction.$transactionID.card.delete, routes/admin.transaction.$transactionID.card.scrub, routes/admin.transaction.$transactionID.delete, routes/admin.account_.$accountID.transactions, routes/dictionary_.$dictID_.edit.bulk-import, routes/org.$orgID.profile.$profileID.edit, routes/admin.transaction.$transactionID, routes/dictionary_.$dictID_.edit.delete, routes/list_.view.$listID_.edit_.delete, routes/admin.account.$accountID.delete, routes/dictionary_.$dictID_.edit.signs, routes/resource_.$resID_.edit.category, routes/resource_.$resID_.edit.language, routes/admin.analytic.sign.individual, routes/category_.$catID_.edit_.delete, routes/org.$orgID.group.$groupID.edit, routes/org.$orgID.profile.bulk-delete, routes/sign.$signID_.edit_.dictionary, routes/admin.region.$regionID.delete, routes/login.profile.$organisationID, routes/resource_.$resID_.edit.signs, routes/admin.analytic.sign.weekday, routes/admin.vocab.$vocabID.delete, routes/org.$orgID.profile.qr-codes, routes/sign.$signID_.edit.category, routes/sign.$signID_.edit.language, routes/sign.$signID_.edit_.keyword, routes/dictionary_.$dictID._index, routes/subscribe.organisation.new, routes/transaction.$transactionID, routes/admin.account_.$accountID, routes/api.asp.sign.$signID.poll, routes/dictionary_.$dictID.signs, routes/dictionary_.$dictID_.edit, routes/game_.$catID.select.$mode, routes/sign.$signID_.edit.delete, routes/sign.$signID_.edit.region, routes/admin.keyword.$keywordID, routes/admin.sign.missing.$type, routes/list_.view.$listID_.edit, routes/account.change-password, routes/list_.pdf.custom-poster, routes/about.babysign.product, routes/about.terms-conditions, routes/admin.region.$regionID, routes/category_.$catID_.edit, routes/org.$orgID.transaction, routes/quiz_.$id_.edit.delete, routes/resource_.$resID_.edit, routes/about.babysign._index, routes/explore_.sign.popular, routes/explore_.sign.updated, routes/sitemap-dynamic[.]xml, routes/about.babysign.signs, routes/about.babysign.songs, routes/admin.keyword._index, routes/admin.vocab.$vocabID, routes/explore_.sign.random, routes/list_.pdf.dictionary, routes/list_.pdf.flash-card, routes/list_.pdf.quiz-sheet, routes/login.profile._index, routes/account.transaction, routes/admin.analytic.list, routes/admin.analytic.sign, routes/dictionary_.$dictID, routes/about.dictionaries, routes/admin.organisation, routes/game_.$catID.match, routes/list.custom-poster, routes/list_.pdf.matching, routes/list_.view.$listID, routes/org.$orgID.manager, routes/org.$orgID.profile, routes/sign.$signID_.edit, routes/subscribe.personal, routes/account_.retrieve, routes/admin.transaction, routes/api.random.phrase, routes/explore_.sign.new, routes/list_.pdf.dominos, routes/admin.dictionary, routes/org.$orgID.admin, routes/org.$orgID.group, routes/resource_.$resID, routes/subscribe._index, routes/category.$catID, routes/category._index, routes/choose-language, routes/list.dictionary, routes/list.flash-card, routes/list.quiz-sheet, routes/quiz_.$id_.edit, routes/admin.category, routes/admin.sign-old, routes/login.personal, routes/about.keysign, routes/admin.account, routes/admin.keyword, routes/explore_.sign, routes/list.matching, routes/login.profile, routes/about.auslan, routes/about._index, routes/account.edit, routes/admin.region, routes/api.category, routes/list.dominos, routes/login._index, routes/sign.$signID, routes/admin.vocab, routes/api.keyword, routes/healthcheck, routes/list._index, routes/about.font, routes/admin.list, routes/admin.quiz, routes/admin.sign, routes/dictionary, routes/org.$orgID, routes/org._index, routes/util_.beta, routes/quiz_.$id, routes/subscribe, routes/api.sign, routes/browse.$, routes/category, routes/resource, routes/account, routes/contact, routes/desktop, routes/profile, routes/logout, routes/search, routes/_index, routes/admin, routes/cdn.$, routes/login, routes/eula, routes/game, routes/join, routes/list, routes/api, routes/org, routes/pwa

However it generates a lot of bogus data where it's clearly missing the url before the slug.
i.e.

<url>
    <loc>https://beta.signplanet.net/card/delete</loc>
    <lastmod>2023-08-28T00:53:25.534Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
</url>
<url>
    <loc>https://beta.signplanet.net/card/scrub</loc>
    <lastmod>2023-08-28T00:53:25.534Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
</url>
admin.transaction.$transactionID.card.delete.tsx
admin.transaction.$transactionID.card.scrub.tsx

It just so happens these two examples are ones I should mask out of the sitemap, however there are also many others, it appears to just lose all information about the URL before any slug, and only uses the trailing most portion.

Also in this project I haven't setup any SitemapFunctions yet, as I was just trying to get static urls sitemapped first to ensure I even want to use this library.

Ideally any url with a slug which doesn't have a SitemapFunction binding should be ommited from the sitemap.

remix-sitemap.config.js

/** @type {import('remix-sitemap').Config} */
module.exports = {
    siteUrl: "https://beta.signplanet.net",
    generateRobotsTxt: false
}

Environment:

fedeya commented 1 year ago

This is too strange, i'm already filtering the dynamic routes without a SitemapFunction here

https://github.com/fedeya/remix-sitemap/blob/de457eefd4b2dbc4546f9db8db7653f73bc90c1c/src/utils/validations.ts#L39

I also tried to add the admin.transaction.$tr.. routes, in a new project and it works as expected.

Maybe your route manifest is different for a specific adapter or something, can you provide a repo with the error?

Thanks!

AjaniBilby commented 1 year ago

If you clone this and run npx remix-sitemap in the root directory you will get the behaviour I was referring to.

It generates these <loc>s which don't exist:

fedeya commented 1 year ago

Now i found the issue, it seems that the internal path in the manifest is nested if any parent route exist.

With that repo if i remove dictionary_.$dictID.tsx the children route signs works as expected but with it i don't get the complete path (/dictionary/:dictID/signs) i just get signs and it doesn't match the condition, same goes for dictionary_.$dictID.edit.tsx.

I'm going to try moving the validation to check the internal path id, it seems to always be the same.

fedeya commented 1 year ago

Hi @AjaniBilby this should be fixed in the latest version (2.2.4), can you try again? thanks

AjaniBilby commented 1 year ago
npx remix-sitemap
🔍 Found config file: C:\Users\me\Documents\GitHub\signplanet\remix-sitemap.config.js
node:internal/errors:496
    ErrorCaptureStackTrace(err);
    ^

Error [ERR_UNSUPPORTED_ESM_URL_SCHEME]: Only URLs with a scheme in: file, data, and node are supported by the default ESM loader. On Windows, absolute paths must be valid file:// URLs. Received protocol 'c:'
    at new NodeError (node:internal/errors:405:5)
    at throwIfUnsupportedURLScheme (node:internal/modules/esm/load:133:11)
    at defaultLoad (node:internal/modules/esm/load:84:3)
    at DefaultModuleLoader.load (node:internal/modules/esm/loader:281:26)
    at DefaultModuleLoader.moduleProvider (node:internal/modules/esm/loader:192:22)
    at new ModuleJob (node:internal/modules/esm/module_job:63:26)
    at #createModuleJob (node:internal/modules/esm/loader:216:17)
    at DefaultModuleLoader.getJobFromResolveResult (node:internal/modules/esm/loader:169:34)
    at DefaultModuleLoader.getModuleJob (node:internal/modules/esm/loader:154:17)
    at DefaultModuleLoader.import (node:internal/modules/esm/loader:245:28) {
  code: 'ERR_UNSUPPORTED_ESM_URL_SCHEME'
}

Node.js v20.3.1
fedeya commented 1 year ago

Sorry, are you using the version 2.2.4? the remix-sitemap command should be common js not esm

AjaniBilby commented 1 year ago

Theoretically it is using that version, but you don't seem to have a --version command, and npx can behave in weird ways sometimes

cat node_modules\remix-sitemap\package.json
{
  "name": "remix-sitemap",
  "version": "2.2.4",
fedeya commented 1 year ago

Ok, i can't debug this propertly because looks like a specific windows error but i guess is an error with the import of the config file

i merged this https://github.com/fedeya/remix-sitemap/pull/48 to try to fix it and i released a new version (2.2.5) can you try again with it? sorry

AjaniBilby commented 1 year ago

2.2.5 works, except for some reason there are some routes like /login/undefined in the sitemap which I haven't got time to attempt to replicate right now

fedeya commented 1 year ago

Oh yeah i forgot to handle the index nested route case, i guess i fixed it in a new release (2.2.6), could you try again?

AjaniBilby commented 1 year ago

This now works completely including successful dynamic route generation, and also exclusions are correctly applying.

Thank you 🫡