juhuntenburg closed 5 months ago
Aggregates didn't run because the previous sync had warnings. I fixed those and made a note in the recipe to look out for that.
When no new datasets need syncing, the AWS sync itself is pretty OK. What takes very long is the symlink creation. Check if we can do something about that.
2024-01-03 17:24:06 Creating symlinks
creating symlinks 20000/70338
creating symlinks 40000/70338
creating symlinks 60000/70338
2024-01-03 20:25:25 Syncing to public S3 bucket data/
2024-01-03 21:07:54 Syncing to public S3 bucket aggregates/
2024-01-03 21:08:07 Finished
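To put numbers on where the time goes, here is a small sketch that computes how long each stage took from the timestamps in the log above. The function name `stage_durations` is made up for illustration and is not part of the release scripts.

```python
from datetime import datetime

# Timestamps and stage names copied from the log above.
LOG = [
    ("2024-01-03 17:24:06", "Creating symlinks"),
    ("2024-01-03 20:25:25", "Syncing to public S3 bucket data/"),
    ("2024-01-03 21:07:54", "Syncing to public S3 bucket aggregates/"),
    ("2024-01-03 21:08:07", "Finished"),
]


def stage_durations(log):
    """Return (stage name, duration) pairs, one per stage that has a following timestamp."""
    fmt = "%Y-%m-%d %H:%M:%S"
    times = [datetime.strptime(ts, fmt) for ts, _ in log]
    # Each stage runs from its own timestamp until the next one.
    return [(name, end - start) for (_, name), start, end in zip(log, times, times[1:])]


for name, delta in stage_durations(LOG):
    print(f"{name}: {delta}")
```

For this run that gives roughly 3 hours for the symlinks, about 42 minutes for the data/ sync and 13 seconds for aggregates/.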
Tried to optimize the symlinks part, but for now it doesn't seem like we can save a lot of time, as we need to loop through all datasets to create the paths. I don't see much of a way forward other than only creating symlinks for newly released datasets, but I kind of like that we check everything and make sure it's clean on every release. Will leave it like this for now.
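One possible middle ground, as a rough sketch only: still loop over every dataset so the full check stays, but only touch the filesystem when a link is missing or points to the wrong place. The names `ensure_symlink` and `sync_symlinks` and the link-to-target mapping are hypothetical and not taken from 03_copy_data.sh. Whether this saves anything depends on whether the time is spent building the paths or writing the links, which isn't clear from the log.

```python
import os
from pathlib import Path


def ensure_symlink(link: Path, target: Path) -> bool:
    """Make `link` point at `target`; return True if anything was changed."""
    if link.is_symlink():
        if os.readlink(link) == str(target):
            return False  # already correct, nothing to do
        link.unlink()  # stale link pointing somewhere else
    elif link.exists():
        # A real file or directory is in the way; don't silently overwrite it.
        raise FileExistsError(f"{link} exists and is not a symlink")
    link.parent.mkdir(parents=True, exist_ok=True)
    link.symlink_to(target)
    return True


def sync_symlinks(mapping) -> int:
    """Check every (link, target) pair and return how many links were (re)created."""
    changed = 0
    for link, target in mapping.items():
        changed += ensure_symlink(Path(link), Path(target))
    return changed
```

On a release with no new datasets, `sync_symlinks` would return 0 and write nothing, while a stale or missing link would still be caught and fixed.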
commit 6ec6c83
https://github.com/int-brain-lab/iblalyx/blob/main/releases/03_copy_data.sh
This takes too long. Also, the aggregates sync doesn't seem to run; unclear why.