Open awaelchli opened 1 year ago
Hi @awaelchli, I ran into this issue as well. Based on the discussion in this AlphaFold issue, I tried to modify the download server to use the PDBj mirror. Unfortunately, while this downloads data, it does not access the correct snapshot:
rsync --recursive --links --perms --times --compress -v --info=progress2 --delete data.pdbj.org::ftp_data/structures/divided/mmCIF/ $OUT_DIR
Simply switching the server to PDBj via this command also seems to struggle to download:
rsync -rlpt -v -z --delete snapshots.pdbj.org::20220103/pub/pdb/data/structures/divided/mmCIF/ $OUT_DIR
@awaelchli I looked into this a bit more. I am fairly confident that this command can serve as a replacement for the original offending command. Of course, it would be great if someone else could verify that this is reasonable.
aws s3 cp --no-sign-request s3://pdbsnapshots/20220103/pub/pdb/data/structures/divided/mmCIF $OUT_DIR --recursive 2>&1 > /dev/null
The RCSB started using AWS in April of last year. This command just downloads the relevant snapshot used by RODA via the AWS CLI.
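For anyone who wants to reuse this, here is a minimal sketch of the same download wrapped in a small script. The function names `snapshot_uri` and `download_snapshot` are made up for illustration and are not part of OpenFold. The bucket path is the one from the command above, and `--only-show-errors` is swapped in for the output redirect to silence per-file progress. I have only reasoned through this, so treat it as a starting point rather than a tested fix.

```shell
#!/usr/bin/env bash
# Sketch: download a dated PDB snapshot from the public pdbsnapshots bucket.
# snapshot_uri and download_snapshot are hypothetical helper names.

snapshot_uri() {
    # Build the S3 URI of the divided mmCIF tree for a snapshot date (YYYYMMDD).
    local snapshot="$1"
    echo "s3://pdbsnapshots/${snapshot}/pub/pdb/data/structures/divided/mmCIF"
}

download_snapshot() {
    local snapshot="$1" out_dir="$2"

    # Fail early with a readable message if the AWS CLI is missing.
    if ! command -v aws >/dev/null 2>&1; then
        echo "aws CLI not found; install it first" >&2
        return 1
    fi

    mkdir -p "$out_dir"

    # --no-sign-request: anonymous access, no AWS credentials needed.
    # --only-show-errors: suppress per-file progress, keep errors visible.
    aws s3 cp --no-sign-request --only-show-errors --recursive \
        "$(snapshot_uri "$snapshot")" "$out_dir"
}

# Example call (not run automatically):
#   download_snapshot 20220103 "$OUT_DIR"
```

Keeping the snapshot date as a parameter makes it easy to point at a different snapshot, assuming the bucket uses the same layout for other dates.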
The download script download_roda_pdbs.sh has this rsync command:
https://github.com/aqlaboratory/openfold/blob/84659c93ba6f06b8a0a2646d1cf27646c003a0c6/scripts/download_roda_pdbs.sh#L35
It doesn't download anything, perhaps because there aren't any files at the given path on the server. I'm following the README instructions after having downloaded the RODA files and running flatten_roda.sh.
Any ideas how to download these files?