Closed bskinn closed 2 years ago
I'm seeing something that seems to support Brian's suggestion:
somehow, author metadata information is being requested from some object(s) that will hold it in the future, but that have not yet been populated at the time of these errors
I created PR #352 to add some logging when author.content is undefined, and what you can see in the build that fails for quansight-consulting-published is that the author field points to some kind of SHA, for example 5c56c772-bc6d-4e3d-aae9-e5d202fcf70a.
IIRC, aren't JavaScript Promises represented by SHAs/UIDs(?) of that sort?
I looked at the Next.js code yesterday for the GetStaticProps type, and noticed the return type is Promise<GetStaticPropsResult<P>> | GetStaticPropsResult<P>, so it seems to me that this would fit the hypothesis also.
Ok, whatever that hash is, it's not pointing to a Promise -- this console.log() call doesn't ever seem to be hit. Seems it's just a string UID of some kind, returned instead of the actual author data.
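To make that distinction explicit in the logs, a small type guard could separate "resolved author object" from "bare UID string" from "nothing at all". This is only a sketch: the ResolvedAuthor shape and the describeAuthorField helper are hypothetical names, not code from the repository, and it assumes an unresolved author arrives as a plain UUID-formatted string like the one above.

```typescript
// Hypothetical shape; the real author type in the repository may differ.
interface ResolvedAuthor {
  content: { firstName: string; lastName: string };
}

// Assumption: an unresolved author shows up as a bare UID string.
type AuthorField = ResolvedAuthor | string | undefined;

// Matches the 8-4-4-4-12 hex layout of the UID seen in the failing build.
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// True only when the field is an object that actually carries content.
function isResolvedAuthor(author: AuthorField): author is ResolvedAuthor {
  return (
    typeof author === 'object' && author !== null && author.content !== undefined
  );
}

// Classifies the field so a log line can say what was actually received.
function describeAuthorField(
  author: AuthorField,
): 'resolved' | 'unresolved-uid' | 'missing' {
  if (isResolvedAuthor(author)) return 'resolved';
  if (typeof author === 'string' && UUID_PATTERN.test(author)) {
    return 'unresolved-uid';
  }
  return 'missing';
}
```

Logging describeAuthorField(author) next to the post slug would show directly whether a failing post received a UID string or no author at all.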
Two further confusing things:
In that same build, the articles/authors that hit this logging call aren't the same as the articles where the TypeError occurred:
TypeError (six occurrences):
It appears that rendering of each library tile requires scanning across all of the blog posts, and for any given post where the TypeError occurs, it may stem from a failure to correctly populate author in any post on the site?
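If that reading is right, a single unpopulated author anywhere would break every page that scans the full post list. A rough way to check is to scan all posts once and report the ones whose author did not resolve, instead of throwing on the first. The Post and Author shapes and both function names below are hypothetical, made up for illustration rather than taken from the site's code.

```typescript
// Hypothetical shapes; the real post/author types in the repository may differ.
interface Author {
  content: { firstName: string; lastName: string };
}
interface Post {
  slug: string;
  author: Author | string | undefined;
}

// Hypothetical tolerant lookup: returns null instead of throwing a
// TypeError when the author was not populated with an object.
function authorNameOrNull(post: Post): string | null {
  const { author } = post;
  if (typeof author !== 'object' || author === null || !author.content) {
    return null;
  }
  return `${author.content.firstName} ${author.content.lastName}`;
}

// Scans every post and returns the slugs whose author is unresolved,
// so one bad post is reported rather than failing the whole scan.
function findUnresolvedAuthors(posts: Post[]): string[] {
  return posts
    .filter((post) => authorNameOrNull(post) === null)
    .map((post) => post.slug);
}
```

Running findUnresolvedAuthors over the full post list during a failing build would show whether the posts named in the TypeError are the ones with bad author data, or whether some other post is the culprit.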
What site is this for?
Quansight LLC
Expected behavior
No response
Actual behavior
On some builds of the LLC website, the build fails with TypeError: Cannot read property 'firstName' of undefined errors for some, but not all, published blog posts. Here are three example failing builds: one, two, three.
The error/failure is not deterministic. Redeploying a build that fails from this error can sometimes succeed in the future, though it will often take 5+ redeployments before a successful build.
For reference, a full traceback for a representative error:
Originally, I thought there was something wrong with a specific team member in /team. In the leadup to launch, when I was in the course of trying to publish all of the migrated blog posts, the build error would occur whenever I had either of the two posts by Adam Lewis, "Panel/Holoviews Learning Aid" and/or "Spatial Filtering at Scale with Dask and Spatialpandas", set in the Published state. If both were Unpublished, the build would consistently succeed.
To emphasize again: the TypeError would occur on more posts than just these two posts I thought were problematic.
This assumption, that Adam Lewis's author entry was the problem, was reinforced when I switched the author of one of these posts to Dharhas, set that post to Published, and observed successful builds. If I then switched back to Adam Lewis as author, the build failure would recur.
Later on, after @kherma's fix for #310 was implemented, the same build errors started to occasionally occur again, even with Adam Lewis's posts set as Unpublished. I noted this in https://github.com/Quansight/Quansight-website/pull/310#issuecomment-1174962115 and the following conversation. It happened infrequently enough, though, that I decided it wasn't worth trying to fix before launch -- I would just manually kick off redeploys as needed until I got a successful build.
One final observation: as best I know, these build errors ONLY occur on builds that are configured to use only Published Storyblok content -- -staging builds triggered on the exact same GitHub code and Storyblok content, which pick up both Published and Unpublished content, do not experience this error. As an example, see the following two builds, from one point in the midstream of @gabalafou's diagnosis efforts for the problem (commit 78e192e):
quansight-consulting-published build -- only Published content, error occurs
quansight-consulting-staging build -- both Published and Unpublished content, NO error occurs
The non-determinism here makes me think that this may be a race condition in the pre-rendering step: somehow, author metadata information is being requested from some object(s) that will hold it in the future, but that have not yet been populated at the time of these errors. Given the appearance of getLibraryTiles in the traceback, it seems likely that the problem is coming from the library-population code for /library, where the page collects the information it needs in order to show the grid of library items.
I'm at a loss for why the error would only occur for builds with only Published content. Perhaps there's something in the logic where the Next.js code is querying the /post items for (Un)published status that is contributing to a possible race condition? Or triggering early/late population of the author metadata?
If it is a race condition, perhaps putting some sort of retryer or setTimeout() on the call to getAuthorName() in getBlogArticlesProps.ts, and possibly also to the similar call in getLibraryLinksProps.ts, might help? These would probably only be band-aids, though, not solutions.
How to Reproduce the problem?
No response
Anything else?
No response