Monorepo - Githubissues

berekuk commented 2 years ago

PR for #4.

Due to the nature of this PR it wasn't really possible to come up with a clear diff, sorry.

I tried to keep the changes to the minimum whenever possible, but sometimes it wasn't practical, so I opted for documenting everything I changed instead.

Code layout:

Code is in src/ to avoid too many root-level dirs
Some files are necessary in root, e.g. tailwind & postcss configs, but the number is minimal, so it shouldn't be a problem (I have some previous experience with a similar setup)
Backend in src/backend/, frontend in src/web/; I'd like to refactor this further later

JS/TS setup:

I settled on npm instead of yarn (both were used), there are some potential issues with using both, and I think npm is good enough these days unless proven otherwise
I added some basic and very forgivable typescript configuration; mostly due to "type":"module" issues in package.json which made the frontend & backend code not very compatible in the monorepo setup
- This could be fixed in other ways but ESM mess is so annoying that I didn't want to deal with it twice (not my area of expertise, tbh)
- I converted some other files to ts/tsx, mostly to help with minor refactorings (I'll convert the rest in a separate PR later)
Backend CLI can now be invoked with ts-node src/backend/index.js, or with npm run cli which wraps the same command

Basic coding style:

I prettier-ized everything (because I edited some code and my vscode kept reformatting it and because this PR is a mess anyway)

Configuration:

As far as I could figure out, some conf were in .env and some were in input/secrets.json; I believe .env is enough and I'm moving in that direction, but I haven't reviewed all the code that uses getSecret yet
Next.js recommends using .env (and other files like .env.production) for env and secrets too
For Netlify we can set up all secrets in its web admin (I'll check this later, see "Stuff that's not ready yet" below)
There's an issue with CLI scripts (entry point is now in src/backend/index.js) that they don't pull .env vars by default; I set up dotenv for this and imported it in src/backend/index.js and src/backend/manual/manualDownload.js but there are probably other one-time CLI scripts which need it too

New:

I added POSTGRES_NO_SSL=1 option for local postgres (it's used in src/backend/database/pg-wrapper.js)

API:

I reintegrated API code to src/pages/api, as we discussed
Most methods are intact, some are renamed since you said API is not stable yet & it's not a big deal; I consider it all temporary anyway since we'll move in GraphQL direction soon
I removed /api/forecasts-by-id and getDashboardForecasts, I believe it was superseded by /api/dashboard-by-id (correct me if got this wrong, please)
I dropped duplicate code from api (pg-wrapper, mongo-wrapper), these were somewhat different from their backend versions.
- Might be important! I haven't read the code carefully but assumed that backend version is correct/good enough, please let me know if I'm wrong here.
I haven't enabled CORS to API endpoints. I can add it if necessary, but api in this branch on /api on the same domain, so it works anyway.

New frontpage setup:

So, Netlify is serverless, and json files won't work there; so I had to create a new DB table latest.frontpage which just stores frontpage_full and frontpage_sliced columns
Also, Netlify doesn't have cron; /api/cron/update-frontpage is now responsible for populating the DB, and it can be called from github actions (see: https://vercel.com/docs/concepts/solutions/cron-jobs; this doc is from Vercel but should be identical on Netlify)
Github actions are not set up properly yet, but otherwise it works fine
- TODO: add auth token for /api/cron/update-frontpage so that we don't get DDOSed (will do this before merge if we decide to go with this approach)
BTW, I'm not even sure if all this is needed; if I guessed correctly that frontpage_sliced.json is just a performance optimization, then similar effect can be achieved with getStaticProps and incremental static regeneration
- But there was a log of all frontpage changes in a file so I assumed you considered it useful for something, so I repeated it as is - the new latest.frontpage table is append-only and I don't remove old data

Cleanups:

I removed lib/data dir from the frontend (it wasn't used anywhere, please let me know if I shouldn't have done this)

Minor fixes:

Initialized documentContents to [] in mongoRead (it failed on an empty DB otherwise)
Added a call to pgInitializeScaffolding in pg-wrapper
Sometimes in the process of converting files from js to ts I stumbled on bugs such as passing useless className props to custom React components; I just fixed those as I went

Stuff that's not ready yet:

Env configuration; I'm in the process of setting up a separate netlify deployment to check that everything works properly, and will finish getSecret refactoring in further commits.
New README; old README files are stale and I'll have to rewrite it significantly before we merge this.
Cron setup; I have a plan on how netlify + DO database + github actions for cron would be enough for everything, it might be too radical for you, though (no CLI! no local json files!), not sure yet, we can settle on other setups if you don't like this one :) will describe it in more details soon.

NunoSempere commented 2 years ago

This seems good!

Some comments here:

I dropped duplicate code from api (pg-wrapper, mongo-wrapper), these were somewhat different from their backend versions.

Might be important! I haven't read the code carefully but assumed that backend version is correct/good enough, please let me know if I'm wrong here.

Yeah, I think these were a bit different; the frontend version had a bit more error checking to ensure that the webpage doesn't break down completely in case of a database error. Should be relatively easy to redo.

I can imagine CORS not being disabled being a problem if someone wants to query metaforecast endpoints from a different domain.

The frontpage pages were partly for optimization reasons, but mainly for debugging the crontab which wasn't working. Seems fine to delete them.

lib/data was not used, iirc

Re:cron, You are right that I would have used something less radical, but I'm ok with you using Github actions and other black magic beyond my comfort zone :)

Also, I think there are some conflicts on metaculus-fetch function; they were throttling the requests. This shouldn't be too hard to fix.

NunoSempere commented 2 years ago

PS: If I merge this into master, the heroku repo in charge of the repo will break. This is not that much of a problem, since I can continue updating the pg database using my local branch for a few days.

berekuk commented 2 years ago

Yeah, I think these were a bit different; the frontend version had a bit more error checking to ensure that the webpage doesn't break down completely in case of a database error. Should be relatively easy to redo.

Ok, I'll re-read that code more carefully then. (Though I don't like the current approach to error checking, but that's a different topic which we can discuss later)

The frontpage pages were partly for optimization reasons, but mainly for debugging the crontab which wasn't working. > Seems fine to delete them.

Cool, I'll look into whether I can rewrite it without a caching table (which seems like an overkill, tbh).

Re:cron, You are right that I would have used something less radical, but I'm ok with you using Github actions and other black magic beyond my comfort zone :)

The key question here is whether we want CLI and files for one-shot tasks.

I think both approaches (Netlify & manually managed VPS) are viable, but there are tradeoffs, I'll write them down tomorrow.

Also, I think there are some conflicts on metaculus-fetch function; they were throttling the requests. This shouldn't be too hard to fix.

Oops, I tried to rebase on fresh master but messed up, will fix.

PS: If I merge this into master, the heroku repo in charge of the repo will break. This is not that much of a problem, since I can continue updating the pg database using my local branch for a few days.

Yeah, don't merge it yet! I plan to push further changes to this branch for one or two days until it's ready.

berekuk commented 2 years ago

Yeah, I think these were a bit different; the frontend version had a bit more error checking to ensure that the webpage doesn't break down completely in case of a database error. Should be relatively easy to redo.

Hmm. The only bit I managed to find is extra checks in pgGetByIds, but in case of an error (empty ids list) it returns a nonsensical string which probably triggers an exception down the road anyway, so that's inconsequential.

(Btw, I'm pretty sure pgGetByIds's code is vulnerable to SQL injections...)

berekuk commented 2 years ago

Ok, here goes, another set of commits.

Env configuration; I'm in the process of setting up a separate netlify deployment to check that everything works properly, and will finish getSecret refactoring in further commits.

I removed getSecret code entirely and documented env configuration in https://github.com/berekuk/metaforecast-backend/blob/monorepo/docs/configuration.md.

If you had the local secrets.json file, you'll have to replace it with .env file.

We'll need to make sure all of env variables are set on Heroku & Netlify (the set of required variables is technically not identical, but I think it'll be easier to just push the same set to both services). I think most of vars are already set on Heroku, and none are set on Netlify. Also, NEXT_PUBLIC_SITE_URL is new and should be set on Netlify too.

New README; old README files are stale and I'll have to rewrite it significantly before we merge this.

Yep, https://github.com/berekuk/metaforecast-backend/blob/monorepo/README.md

Cron setup

As we discussed in #5, we don't actually want to go with Github Actions and Netlify background functions, so we don't really need to change anything.

Minor improvements and changes:

npm run dbshell now connects to postgres
added netlify-cli to package.json (check out netlify env for easier configuration)
cleaned up almost all of data/ and input/ files
cleanup up some old commented code

About the new frontpage code.

My first set of commits moved frontpage data from json files to latest.frontpage; today I rewrote all of that to direct SQL just-in-time queries (see https://github.com/berekuk/metaforecast-backend/blob/monorepo/src/backend/frontpage.ts).

To avoid the performance degradation of the front page (ORDER BY RANDOM() is kinda slow), I spent a few hours refactoring the commonDisplay and a few related files.

The changes are significant; @NunoSempere, you might want to do a code review just to get acquainted with the new version and not be surprised in the future.

The main change is that / and /capture now use getStaticProps with revalidate option; this speeds up the initial page load significantly:

azrael@azrael:~$ time curl -s 'https://shimmering-bavarois-62a68f.netlify.app/' >/dev/null

real    0m0.029s
user    0m0.020s
sys 0m0.003s
azrael@azrael:~$ time curl -s 'https://metaforecast.org' >/dev/null

real    0m2.030s
user    0m0.015s
sys 0m0.011s

(https://shimmering-bavarois-62a68f.netlify.app/ is my test instance on netlify; it works on top of incomplete database, but it shouldn't matter)

This means that all search requests except for static frontpage data are performed on client side.

I also did a few cleanups:

got rid of complex hasDisplayBeenCaptured-related code with one useEffect call (see https://github.com/berekuk/metaforecast-backend/blob/monorepo/src/web/display/displayOneForecastForCapture.tsx#L179)
rewrote some logic in commonDisplay with useEffect
fixed a few bugs, but didn't kept the track of everything, sorry
- also, useEffect should fix a few race conditions by itself (related: https://overreacted.io/how-are-function-components-different-from-classes/)
simplified JSX code
added more typescript checks
rewrote history.replaceState with router.replace
submitted a few issues (#11, #12, #13; first of these is now fixed due to router.replace refactoring)

Final thoughts on merging this branch.

Here's what has to be done for proper deployment:

configure env vars on Heroku and Netlify
update heroku scheduler script path (and we'll have to either build typescript code to /out dir with tsc, or install ts-node in production)
merge & check

It might be easier/safer if I did the final deployment, I believe I have all necessary credentials for this, let me know if you're ok with this.

NunoSempere commented 2 years ago

The main change is that / and /capture now use getStaticProps with revalidate option; this speeds up the initial page load significantly

Not sure I like these new changes.

First, curl is just fetching the html, but it's not loading the javascript. So I'm not sure it's measuring what we care about (time until website shows useful things to the user?)

If you test https://metaforecast.org/?query=test vs https://shimmering-bavarois-62a68f.netlify.app/?query=test, the second is much more annoying; there are three react loads between initial page load and results being shown.

Similarly, because you're using staticprops instead of serverside props, the frontpage is also fetched after initial page load, and there is some initial flashing (??).

All in all, the previous approach seems like it was faster, because:

When loading the frontpage, I was fetching a json, rather than using the really expensive ORDER BY RANDOM call
When loading from a query saved in the url, the server side took as long as the frontend does now, but returned a page which was already ~rendered/didn't have to refresh.

I think that to save the previous approach with the current structure, we could:

have a latest.frontpage_active with just 50 elements, instead of a json, or instead of doing the really expensive random query. Its contents could be changed every day/few hours by a cron job.
go back to using getServerSideProps to avoid initial flashing, even if that means that the initial page load as measured by curl is slower.

Thoughts here? I personally really dislike multiple reloads until content is shown, but this is not necessarily universal.

NunoSempere commented 2 years ago

Excepting the above details with regards to the frontpage, I'm really liking how this is going. I'm going through the code, and it looks like a different, less amateurish and better organized beast.

berekuk commented 2 years ago

there is some initial flashing (??).

This can (and should) be fixed, flashing is due to scrollbar shrinking the page, I planned to look into that anyway, should be fixable with better CSS (the same flashing issue exists on the current website).

When loading the frontpage, I was fetching a json, rather than using the really expensive ORDER BY RANDOM call

No, my version doesn't pay ORDER BY RANDOM costs at all! getStaticProps always prepares props in the background, even when revalidation is enabled.

Overall, for the frontpage without search query, the costs in the new version are:

fetching static HTML
react hydration of the initial page without data (since we can't prerender frontpage content on the server side)
time to render the full version

While in the old version the costs are:

fetching non-static HTML (slower than static, since Next.js has to render JSX on every call)
(but you don't have to wait for the hydration on the frontend to see the data)

And for the version without search query, new version is faster on time-to-first-byte, but slower on fetching the data since it requests algolia only after hydration.

https://pagespeed.web.dev/report?url=https%3A%2F%2Fmetaforecast.org%2F vs https://pagespeed.web.dev/report?url=https%3A%2F%2Fshimmering-bavarois-62a68f.netlify.app%2F are inconclusive: sometimes old version is faster, sometimes the new one.

(I still like that static version shows the shell with top menu much faster, though.)

Overall, I agree that the gains are not clear, so I'll roll it back, but I'd like to think more about the alternatives later.

I have an idea of splitting search into /search page, then maybe we could have a fast static fully SSR frontpage and slightly slower /search page... but unfortunately frontpage should be filterable on stars and platforms, and if we serve all filters from /search, then we'd still need to pull frontpage.json data into /search, which means we'd still need to cache it in the DB for the optimal performance. This would speed up the default / page, at least, and so might be worthwhile (static fully prerendered pages are much better than anything else, obviously, it's just that the query gets in the way).

There might also be a solution in the direction of "let's statically render everything on demand" (Next.js is capable of doing that with getStaticPaths + fallback=blocking), but it would require different URL schemas, like https://metaforecast.org/search/stars/2/forecasts/100/display/20/platforms/Foretold,PolyMarket; URL schemas could be hidden by rewrites, though... overall this approach seems too complicated to me. But I'll think about it some more and maybe figure out a better way (later; don't want to spend too much time on this and delay merging this branch).

I personally really dislike multiple reloads until content is shown, but this is not necessarily universal.

In my experience, static app shells (initial half-empty pages, sometimes with gray boxes instead of content) can be annoying, but can also be good and invisible. Reload shouldn't be noticeable to the end user, and when it's not noticeable, there's not much difference between the classic old-school "browser rendered the first half of html" and app shells from cache.

PS: I just noticed that I missed Next.js 12.1 a month ago, turns out streaming SSR has arrived, this is going to be great for performance and possibly enough even without going full static.

berekuk commented 2 years ago

I'll roll it back

Done.

"Rebuild database" job is now #20 in CLI (not included in doEverything), we can put it on Heroku as a separate job.

flashing is due to scrollbar shrinking the page

I fixed this with permanent vertical scrollbar. It's not a perfect solution but still better than jumping content, and all approaches I could find have some downsides.

Some minor refactorings:

I placed some files in web/search/ (anySearchPage.ts with common getServerSideProps code for index and capture, to fix copy-paste, and I moved commonDisplay.tsx there also since it made sense to locate these together)
this slightly broke styles due to too-specific tailwind config
so I gave all React-related files .tsx extensions and changed the config to ["./src/web/**/*.tsx", "./src/pages/**/*.tsx"]
which required some more cleanups due to typescript having stricter requirements for ts files, but mostly minor.

Overall I think we're ready to deploy this in prod, let me know if it's ok if I do it tomorrow!

NunoSempere commented 2 years ago

Hey, I'm still getting the flashing in your test site (though it's possible that you didn't rebuild it). I've made an issue to discuss that in particular: https://github.com/QURIresearch/metaforecast-backend/issues/14

netlify[bot] commented 2 years ago

Deploy Preview for metaforecast ready!

Name	Link
Latest commit	118495c30c214e8e88d3929a2610837ab9d79637
Latest deploy log	https://app.netlify.com/sites/metaforecast/deploys/623f8ec55ae4fc00089a858d
Deploy Preview	https://deploy-preview-8--metaforecast.netlify.app
Preview on mobile	Toggle QR Code... Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

berekuk commented 2 years ago

Netlify logs:

Your environment variables exceed the 4KB limit imposed by AWS Lambda. Please consider reducing them.

Ugh. This is annoying and it means that I'll have to figure out a way to inject a .env file to Netlify on the build step, instead of just configuring env variables through Netlify.

berekuk commented 2 years ago

Merged and deployed!

I set up the frontpage cron job to run every hour and ran both cron jobs by hand, everything seems fine.

The issue with env limit is still important, for now I worked around it by trimming cookies and removing some unused ones (e.g. CSETFORETELL_COOKIE) so that the entire config fits in 3kb, but this won't work in the long run.

I'll file a separate issue on this and continue from there.

quantified-uncertainty / metaforecast

Monorepo #8

Deploy Preview for metaforecast ready!

quantified-uncertainty / metaforecast

Monorepo #8

✅ Deploy Preview for metaforecast ready!

Deploy Preview for metaforecast ready!