Open bomjastik opened 5 years ago
Is this happening with v2.2.0? Did you check whether it also happens for you on 2.1.0? Thanks!
Hi @Haroenv, I've just started to implement this too. I currently have 2,159 results, but I'm only getting 2,000 URLs in `sitemap.0.xml`.
Do you have multiple sitemap files, @alexpchin? Does this also happen with `appId: 'latency'`, `indexName: 'instant_search'`, `apiKey: '8baa3791b46c2a52259007747e743e07'`?
Hi @Haroenv, I've just checked with:

```js
algoliaSitemap({
  algoliaConfig: {
    appId: "latency",
    apiKey: "8baa3791b46c2a52259007747e743e07",
    indexName: "instant_search",
  },
  hitToParams: ({ objectID }) => {
    return {
      loc: `https://someurl.com/${objectID}`,
      lastmod: new Date().toISOString(),
      priority: 0.7,
    };
  },
});
```
Only one `sitemaps/sitemap-index.xml` and one `sitemaps/sitemap.0.xml` are created. Inside `sitemaps/sitemap.0.xml` there are 21,000 URLs.
```js
const aggregator = async (args) => {
  let { hits, cursor } = args;
  do {
    if (!hits) {
      return;
    }
    batch = batch.concat(
      hits.reduce((entries, hit) => {
        const entry = hitToParams(hit);
        return entry ? entries.concat(entry) : entries;
      }, [])
    );
    if (batch.length > CHUNK_SIZE) {
      await flush();
    }
    ({ hits, cursor } = await index.browseFrom(cursor));
    console.log('hits -->', hits.length);
    // hits --> 1000   (logged 20 times)
    // hits --> 469
  } while (cursor);
  console.log('batch --> is 21,000', batch.length);
  await handleSitemap(batch);
  const sitemapIndex = createSitemapindex(sitemaps);
  await saveSiteMap({
    sitemap: sitemapIndex,
    root: outputFolder,
    filename: 'sitemap-index',
  });
};
```
Aside: at some point we could potentially add a check for whether the output folder exists, and create it if not?
Thanks for checking, that indeed doesn't look right! Maybe the last batch doesn't have a cursor, and therefore the loop stops. We should switch it to a regular while loop and do the fetching at the beginning of the loop to avoid this, I think. The program flow will be much simpler.
For your aside, that definitely makes sense; I think `mkdir` now has a `-p` option in Node that we can use.
Hi @Haroenv, sorry, yes, I was going to follow this up with a suggested solution, as the problem is with the cursor, but I was eating some noodles... We could solve this quickly with a `run` flag that is updated at the start of the while loop when there is no cursor:
```js
const aggregator = async (args) => {
  let { hits, cursor } = args;
  let run = true;
  do {
    run = !!cursor;
    if (!hits) {
      return;
    }
    batch = batch.concat(
      hits.reduce((entries, hit) => {
        const entry = hitToParams(hit);
        return entry ? entries.concat(entry) : entries;
      }, [])
    );
    if (batch.length > CHUNK_SIZE) {
      await flush();
    }
    ({ hits, cursor } = await index.browseFrom(cursor));
  } while (run);
  await handleSitemap(batch);
  const sitemapIndex = createSitemapindex(sitemaps);
  await saveSiteMap({
    sitemap: sitemapIndex,
    root: outputFolder,
    filename: 'sitemap-index',
  });
};
return index.browse(params).then(aggregator);
}
```
This would work, but it's arguably a little more confusing in terms of flow.
I think that makes sense, but a regular while with the fetching first will probably be simpler :)
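A runnable sketch of that restructure, with `browseFrom` mocked to page through fake hits 1,000 at a time (the 21,469 total mirrors the log above; the real code would call `index.browseFrom` and also flush on `CHUNK_SIZE`):

```js
// Mock of Algolia-style cursor paging: returns up to 1000 hits per
// call and no cursor on the last page.
const total = 21469;
const pageSize = 1000;

const browseFrom = async (cursor = 0) => {
  const hits = Array.from(
    { length: Math.min(pageSize, total - cursor) },
    (_, i) => ({ objectID: String(cursor + i) })
  );
  const next = cursor + hits.length;
  return { hits, cursor: next < total ? next : undefined };
};

const aggregate = async () => {
  let batch = [];
  let { hits, cursor } = await browseFrom();
  while (hits && hits.length) {
    // Append the current page FIRST, so the final page (which comes
    // back without a cursor) is never dropped.
    batch = batch.concat(hits);
    if (!cursor) break; // no more pages to fetch
    ({ hits, cursor } = await browseFrom(cursor));
  }
  return batch;
};

aggregate().then((batch) => {
  console.log(batch.length); // 21469 -- includes the final 469 hits
});
```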
@Haroenv Did you want me to open a PR? I saw that there are also a number of outstanding chores?
All PRs that were open were for devDependencies, so not urgent for this project. A PR would be well appreciated if you confirm that this is the source of the bug. Thanks @alexpchin !
@Haroenv Hi, was there any follow-up on this issue? I have started prototyping an implementation using this library, but this bug is a blocker for now.
Does the suggested fix work for you, @Pr00xxy?
Hello.
I'm trying to create the sitemap and everything works well besides one thing.
When I create the sitemap from an index with 29,248 rows, I get exactly 29,000 rows in the sitemap.
I tried the same for another index with 3,291 rows, and got exactly 3,000 rows in it.
For smaller indexes with fewer than 1,000 rows, I get the correct number of rows.
Is there any reason why my results are being rounded down like this?
My index.js:
Regards.
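For what it's worth, the rounding described above is consistent with the cursor bug discussed earlier in this thread. A self-contained mock of the original do-while (the 29,248 total is taken from the report; `browseFrom` is a fake pager) reproduces it:

```js
// The do-while below advances the cursor at the BOTTOM of the loop
// and tests it afterwards, so the last partial page (248 hits here)
// is fetched but never appended -- the count rounds down to 29,000.
const total = 29248;
const pageSize = 1000;

const browseFrom = async (cursor = 0) => {
  const hits = Array.from(
    { length: Math.min(pageSize, total - cursor) },
    () => ({})
  );
  const next = cursor + hits.length;
  return { hits, cursor: next < total ? next : undefined };
};

const buggyAggregate = async () => {
  let batch = [];
  let { hits, cursor } = await browseFrom();
  do {
    batch = batch.concat(hits);
    ({ hits, cursor } = await browseFrom(cursor));
  } while (cursor);
  return batch;
};

buggyAggregate().then((batch) => {
  console.log(batch.length); // 29000, not 29248
});
```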