SETI / rms-website

rings node website port to jekyll
Apache License 2.0

Deploy script output is unhelpfully verbose #118

Open matthewtiscareno opened 1 year ago

matthewtiscareno commented 1 year ago

When I run one of the scripts in the deploy directory, it first asks me to approve the output of a diff command, which lists only the files that are actually going to be changed. That part is good. However, once I approve the change, the resulting rsync command lists every directory in the file structure in addition to the files that changed. Furthermore, the final two rsync commands list not only every directory but also every file in the file structure.

Perhaps this is happening because Jekyll has regenerated all files and/or has reset timestamps on all directories, rather than only changing those files whose source code has changed?

rfrenchseti commented 1 year ago

I'm aware of the problem with rsync displaying directories. I tried a variety of ways to suppress them without also suppressing directories that really did change (e.g. added or deleted), but I didn't find anything that was both accurate and efficient.
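For illustration only, here is a minimal sketch of one kind of output filtering, assuming the rsync output were produced with `--itemize-changes` (the thread does not say which options the deploy script uses). In that format, a line beginning with `.d` describes a directory that had only attributes such as its timestamp updated, while created directories begin with `cd` and deletions with `*deleting`. The function name and the sample lines are hypothetical.

```python
def filter_itemized(lines):
    """Drop itemized lines for directories that only had attributes
    (for example timestamps) updated; keep every other line."""
    kept = []
    for line in lines:
        # ".d" = directory, nothing transferred, attribute-only change.
        if line.startswith(".d"):
            continue
        kept.append(line)
    return kept


# Hypothetical sample of itemized rsync output.
sample = [
    ".d..t...... docs/",
    ">f.st...... docs/index.html",
    "cd+++++++++ docs/new_section/",
    "*deleting   docs/old_page.html",
]
filtered = filter_itemized(sample)
```

This keeps added and deleted directories visible, but it does nothing about regenerated files whose timestamps changed, so it addresses only the directory noise.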

Yes, I think the problem is that Jekyll regenerates all the files each time, which resets their timestamps.

The one solution to both of these problems is to use rsync's --checksum option, which ignores timestamps and compares file checksums instead. This does work as expected, but it's also much slower because of the large number of non-Jekyll files present in our directory structure.
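A small sketch of why a checksum comparison is immune to regeneration: rewriting a file with identical content changes its modification time but not its checksum. This is an illustration, not the deploy script; SHA-256 is used here as a stand-in for whatever checksum rsync computes internally, and the file name and timestamps are made up.

```python
import hashlib
import os
import tempfile


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def regenerate_demo():
    """Write a file, then rewrite it with the same content and a new mtime."""
    content = "<h1>Rings Node</h1>\n"
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "index.html")
        with open(path, "w") as f:
            f.write(content)
        os.utime(path, (1_000_000_000, 1_000_000_000))  # pretend "old" build
        old_mtime = os.path.getmtime(path)
        old_sum = sha256_of(path)

        # Simulate a site generator rewriting the file unchanged.
        with open(path, "w") as f:
            f.write(content)
        os.utime(path, (2_000_000_000, 2_000_000_000))  # pretend "new" build

        return os.path.getmtime(path) != old_mtime, sha256_of(path) == old_sum


mtime_changed, checksum_same = regenerate_demo()
```

A timestamp-based comparison would treat this file as changed, while a checksum-based one would skip it, at the cost of reading every file on both sides.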

I think the right solution is to wait until Debby separates the Jekyll files from the non-Jekyll files and then switch to using the --checksum option. It will be much faster because it will have much less data to compare. The non-Jekyll files can be synced using their timestamps since they aren't constantly regenerated.

rfrenchseti commented 1 year ago

BTW, this is why I chose to use diff for the initial comparison instead of rsync -n. I could make the diff output clean because it only compared the file contents. It's fast because it's operating on the local disk - comparing the jekyll-generated _site directory to the current /var/www/documents. The rsync operations with --checksum are slow because they all go to the RAID, which is vastly slower to read from. Only checking the timestamps is a much faster operation, which is why that's what we do right now.
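To make the content-only comparison concrete, here is a rough Python sketch of the same idea: walk a freshly built tree and report files that are missing from, or differ in content from, the currently deployed tree. It is not the deploy script's actual diff invocation, the function name is hypothetical, and temporary directories stand in for _site and the live document root. Unlike a full diff, it does not report files that exist only in the deployed tree.

```python
import filecmp
import os
import tempfile


def changed_files(new_root, old_root):
    """Return sorted relative paths under new_root that are absent from
    old_root or whose contents differ (timestamps are ignored)."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(new_root):
        for name in filenames:
            new_path = os.path.join(dirpath, name)
            rel = os.path.relpath(new_path, new_root)
            old_path = os.path.join(old_root, rel)
            # shallow=False forces a byte-by-byte content comparison.
            if not os.path.isfile(old_path) or not filecmp.cmp(
                new_path, old_path, shallow=False
            ):
                changed.append(rel)
    return sorted(changed)


def write(root, name, text):
    with open(os.path.join(root, name), "w") as f:
        f.write(text)


# Stand-ins for the generated site and the live document root.
with tempfile.TemporaryDirectory() as site, tempfile.TemporaryDirectory() as live:
    write(site, "same.html", "unchanged\n")
    write(live, "same.html", "unchanged\n")
    write(site, "edited.html", "new text\n")
    write(live, "edited.html", "old text\n")
    write(site, "added.html", "brand new\n")
    result = changed_files(site, live)
```

Because only file contents are compared, a regenerated but identical file such as same.html never appears in the result, which is what keeps the approval prompt clean.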

matthewtiscareno commented 1 year ago

Is there no way to avoid having Jekyll regenerate every file?

rfrenchseti commented 1 year ago

It turns out Jekyll's --incremental option works with static builds as well as while running the Jekyll server. I updated the deploy script to use it when building the website. With this modification, the rsync to staging and the rsync to the RAID are both clean (no extraneous directories or files). The rsync from the RAID to the production servers still shows a few files, mainly in the roses directory. I don't know why. But it's a lot nicer than it was before. We can leave this issue open and I'll look into it some other time.

matthewtiscareno commented 1 year ago

Great!!!