bird-team / brisbane-bird-atlas

Atlas of the Birds of Brisbane: Community bird atlas for Brisbane, Australia
https://brisbanebirds.com
GNU General Public License v3.0

Surveyor sheets not showing new species #129

Closed Louis-Backstrom closed 5 years ago

Louis-Backstrom commented 5 years ago

I tried to add a handful of new species to the atlas overnight (following the instructions on the wiki). That part seems to have worked - I've just pushed the edit to the repo so we'll see how it goes - but the asset building doesn't seem to have reflected the changes. The list of species at the end of the surveyor sheets does not show the new species I've added. (I'm uploading the surveyor sheets manually at the moment because the push failed for them again, but I think the rest worked? The new species have maps etc.)

Any ideas?

jeffreyhanson commented 5 years ago

Yeah, it looks like the build has been failing since 4909ff3a6ff7d305d4a8c21792f08245c97d464b. Did you receive emails about this? I've configured my settings so that I receive emails when builds fail. Looking at the build log (https://circleci.com/gh/bird-team/brisbane-bird-atlas/415) and the commit (https://github.com/bird-team/brisbane-bird-atlas/commit/4909ff3a6ff7d305d4a8c21792f08245c97d464b), it looks like you added an entry into the bookdown YAML "_bookdown.yml" file for a new species page called "Fregetta-marina.Rmd" but didn't create a new Rmarkdown file called "Fregetta-marina.Rmd". Could you please try creating a new "Fregetta-marina.Rmd" and see if that fixes it?
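One way to catch this kind of mismatch before pushing is to check that every .Rmd file named in _bookdown.yml actually exists on disk. The sketch below is illustrative only: the function name missing_rmd_files is hypothetical, and it assumes page file names contain no spaces and are relative to the directory given as the second argument.

```shell
# Print every .Rmd file that is named in a bookdown YAML file but missing on disk.
# Usage: missing_rmd_files <yaml-file> [<directory>]
# Assumption: pages are listed with a .Rmd suffix and have no spaces in their names.
missing_rmd_files() {
  yaml_file="$1"
  base_dir="${2:-.}"
  # Pull out anything that looks like a .Rmd file name, one per line.
  grep -oE '[A-Za-z0-9_./-]+\.Rmd' "$yaml_file" | while read -r rmd; do
    # Report the name only when no such file exists.
    [ -f "$base_dir/$rmd" ] || printf '%s\n' "$rmd"
  done
}
```

Running missing_rmd_files _bookdown.yml from the repository root would print nothing when every listed page exists, and one file name per line otherwise.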

jeffreyhanson commented 5 years ago

Also, you might have to remake the assets after committing and pushing the new Rmarkdown file for the build to succeed.

Louis-Backstrom commented 5 years ago

Yep, the build has been failing but I can't work out why.

The first few commits after 4909ff3 were to fix a couple of species name errors I'd made - e.g. Fregetta marina doesn't exist, so I changed it to Fregetta grallaria, which does, and the same for Cook's Petrel.

Since then though I've been getting a very bare bones error log -

output file: brisbane-bird-atlas.knit.md

Received 'killed' signal

I can't tell why this would be happening - it might have something to do with the order I remade the assets or something like that? I built the assets before pushing the new stuff as that was the order in the wiki but if I get time tonight or sometime this week I'll try rebuilding the assets again.

jeffreyhanson commented 5 years ago

It looks like the atlas is getting bigger, so more than 4 GB of RAM is now needed to build the website. Since the service we use to build the website (CircleCI) has a limit of 4 GB of RAM per job (unless you want to pay), we will need to (1) get creative or (2) reduce the size of the atlas. This is going to be tricky, so I'll see what I can do, but it might take a while before I find something that works.

dbl3raf commented 5 years ago

I'm very happy to pay - what do I need to sign up to?

jeffreyhanson commented 5 years ago

Unfortunately, CircleCI doesn't advertise the costs. The documentation says you have to contact a sales representative (https://circleci.com/docs/2.0/configuration-reference/#resource_class). Might be worth inquiring at least? If you do, maybe mention that this is an open source project? They might give a discount or something.

Louis-Backstrom commented 5 years ago

The only costs I can find mentioned on CircleCI (https://circleci.com/pricing/#build-linux) are quite expensive. Even just increasing our container count from 1 to 2 (which I'm not sure would affect the current issue?) would mean a $50/mo charge. Cost appears to scale linearly with each additional container.

Could we perhaps somehow build the website in two sections? I'm not really sure how it would work but if we could split the build up into two builds that might work.

Problem seems to be that with each commit (even just adding new text to species accounts, but even more so with each new feature) the atlas is going to get bigger, and we've got a fair while left before the project is "feature-complete" so to speak.

The only other possibility I see at the moment, although I'm sure there are others, is somehow using CircleCI's self-hosted building? I'm not really sure how it works, but if we could build the website and/or book on a local server (which would presumably have a sizeable upfront cost, but might work out cheaper over 5 years or whatever) and then use the CI infrastructure for the rest we might be able to get around the resource block somewhat more cheaply?

Louis-Backstrom commented 5 years ago

Trying to do as much fixing up as I can while the website build itself is failing. I'm currently in the process of a remake/repush of the assets (to update them to feb2019, see 952ed5a3ab648213dbfe2b5b0b2a423b61887bb9). The new species added in 4909ff3a6ff7d305d4a8c21792f08245c97d464b and subsequent commits appear to have been only half added. To my knowledge they are present and functioning in:

They aren't in:

I can't work out why they wouldn't be in the surveyor sheets, given the creation of them works fine (I've done a complete rebuild of all the assets from ground up, so it's not just that the command doesn't see any difference between new and old), and they appear to have otherwise been added to the atlas fine (presumably the website would work if it had the resources).

Is there a separate list for what species occur on the surveyor sheet that I should have added the new species to?

The only other thought I have is that the table currently is limited to 2 pages (from the parameters.yaml file) - the list it's using at the moment has about 3 empty rows of space at the end, so perhaps, given there's more than 3 new species to add, instead of just adding them in where they should go and cutting off the list halfway through the "plastics", it's just silently killing the process altogether? I hope that vague thought train makes sense...

Edit: Just went and looked a bit deeper at the surveyor sheets. They've evidently updated to the next data update - the Gordon Park (10195) square has Nieppe Street on the map, a personal location which I only created in February and which only appears in the Feb-2019 eBird release, not the Jan-2019 or prior releases. So the surveyor sheets are getting meaningfully changed, I just don't know why the species list isn't changing with them.

dbl3raf commented 5 years ago

> Unfortunately, CircleCI doesn't advertise the costs. The documentation says you have to contact a sales representative (https://circleci.com/docs/2.0/configuration-reference/#resource_class). Might be worth inquiring at least? If you do, maybe mention that this is an open source project? They might give a discount or something.

I've opened a support ticket with CircleCI and we'll see what the pricing options are. I've emphasized that we are a not-for-profit open source project

Louis-Backstrom commented 5 years ago

It looks like since b5f2eb4 or thereabouts the error message from CircleCI is much simpler than it was when builds first started dying:

Build-agent version 1.0.9008-0abaa7b9 (2019-03-13T15:08:16+0000)
Configuration errors: 1 error occurred:

* 1 error occurred:

* In job 'init': 1 error occurred:

* The job has no executor type specified. The job should have one of the following keys specified: "macos", "machine", "docker"

Have you changed an internal Docker / CI setting that might be causing this, @jeffreyhanson? @dbl3raf suggested that you thought it might be a large image file or something causing a temporary problem (or symptomatic of broader out-of-memory issues). I've gone and removed the only image I think I've added in the last month in cb5ee77, but it doesn't seem like CircleCI even attempts the build?

Hope that's clear

jeffreyhanson commented 5 years ago

Yeah, I was playing around with different build config options for CircleCI a while ago - but I didn't find a fix and forgot to reset it afterwards. I've just reset the CircleCI config file so we should go back to seeing Killed if it doesn't work.

jeffreyhanson commented 5 years ago

It just occurred to me that another potential fix might be reducing the file size of the interactive maps or the graphs. For the interactive maps, this could be achieved by further simplifying the Brisbane LGA geometry or removing the "Grid" option. For the static graphs, we could try saving them as JPEG files instead of PNG files.

jeffreyhanson commented 5 years ago

We might also be able to compress the JPEG files even further using compression optimization tools (e.g. https://www.tecmint.com/optimize-and-compress-jpeg-or-png-batch-images-linux-commandline/)

jeffreyhanson commented 5 years ago

I've just pushed a commit to use JPEG files for the graphs and also to optimize the compression of the JPEGs for the website. When the Docker build environment has finished rebuilding (https://hub.docker.com/r/brisbanebirdteam/build-env/builds), @dbl3raf or @Louis-Backstrom can you please rebuild the graph assets and push them to the releases? To just rebuild the graphs, (1) enter make pull_assets to fetch the assets from GitHub, (2) go into the assets/graphs/ folder, (3) delete all files in this folder, (4) enter make assets to rebuild the graphs, and (5) enter make push_assets to push the new graphs to GitHub.
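The five steps above can be sketched as a small shell function. The make targets and the assets/graphs/ folder come from the instructions; the run helper, the DRY_RUN switch, and the rebuild_graphs name are illustrative assumptions and not part of the repository.

```shell
# Run a command, or just print it when DRY_RUN=1 (hypothetical helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Rebuild only the graph assets; stops at the first step that fails.
# Must be called from the repository root.
rebuild_graphs() {
  # (1) fetch the current assets from GitHub
  run make pull_assets || return 1
  # (2)-(3) delete every file in the graphs folder
  run find assets/graphs -type f -delete || return 1
  # (4) rebuild the graphs
  run make assets || return 1
  # (5) push the new graphs to GitHub
  run make push_assets
}
```

Setting DRY_RUN=1 before calling rebuild_graphs prints the four commands without executing them, which is a cheap way to confirm the sequence before committing to a long rebuild.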

Louis-Backstrom commented 5 years ago

Alright, will try that now. I'll update with progress.

jeffreyhanson commented 5 years ago

Awesome - thank you!

Louis-Backstrom commented 5 years ago

[image]

Getting a lot of this at the moment. Nothing as yet in the /graphs folder - not sure if that's to be expected or not.

jeffreyhanson commented 5 years ago

Ah, the reason you're seeing that is that I forgot to include instructions to update the Docker build environment on your computer. Could you please (1) kill this run, (2) execute the commands make rm_image and then make pull_image to get the latest version of the build environment, which has jpegoptim installed, and (3) follow the instructions in the previous post?

Also, yeah that's normal, you won't see any files in the folder until the end when it finishes.

Louis-Backstrom commented 5 years ago

[image]

Running into problems with make rm_image - I've got no containers running and I restarted docker etc?

jeffreyhanson commented 5 years ago

Hmm, can you try running docker container rm XXXX where XXXX is the string starting with e7f05... in the screenshot, and then try again?
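Combined with the earlier instructions, the recovery sequence might look like the sketch below. The container ID is passed as an argument rather than hard-coded, since only its e7f05... prefix is visible in the screenshot; refresh_build_image, run, and DRY_RUN are hypothetical names introduced for illustration.

```shell
# Run a command, or just print it when DRY_RUN=1 (hypothetical helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Remove a leftover container (if an ID is given), then replace the local
# build image using the make targets mentioned in the thread.
# Usage: refresh_build_image [<container-id>]
refresh_build_image() {
  stale_container="${1:-}"
  if [ -n "$stale_container" ]; then
    # A leftover container can stop the old image from being removed.
    run docker container rm "$stale_container" || return 1
  fi
  run make rm_image || return 1
  run make pull_image
}
```

Calling refresh_build_image with the full container ID from the error message removes that container first; calling it with no argument just replaces the image.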

Louis-Backstrom commented 5 years ago

bingo, worked - trying it all again now

jeffreyhanson commented 5 years ago

Awesome!

Louis-Backstrom commented 5 years ago

Alright, assets pushed to storage, commit pushed to make the build try again.

Of note already, new species still aren't showing up on the surveyor sheets...

Using the command to push the surveyor sheets - probably will fail while I'm out at uni for the next few hours

Louis-Backstrom commented 5 years ago

Right, predictably both the build (same error as previously) and surveyor sheet upload (got further than normal, capitulated with about 75 still to go) failed. I've manually put the rest of the sheets up into the release.

I suppose this means it's time to consider other options rather than staying with CircleCI, as per #134.

The original problem raised in this issue still stands though - the new species still aren't showing up on the surveyor sheets.

Louis-Backstrom commented 5 years ago

Have attempted to change the number of pages allocated to the surveyor sheet checklists in the latest commit - I'll try a rebuild of assets overnight to see if this changes anything. Just a hunch...

Louis-Backstrom commented 5 years ago

Right, fixed the main issue - turns out I hadn't added the TRUE value for the needed column in the species list. I think that means I can finally close this issue, as the rest of the problems addressed here have been migrated over to #134.

jeffreyhanson commented 5 years ago

Awesome - thank you very much @Louis-Backstrom.