This PR updates the data/bdc_dbgap_ids.csv file with the latest dbGaP identifiers from the BDC Gen3 instance. It also fixes some issues with bin/get_dbgap_data_dicts.py when downloading from FTP:
Previously, we fetched the list of files in a directory over FTP, downloaded those files from the corresponding HTTP server, and then tried to fetch the next list of files over the same FTP connection. While the HTTP downloads were running, the idle FTP connection would time out and the server would disconnect. We now explicitly close the connection after getting each list of files and open it again before getting the next one.
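As a rough illustration of the close-and-reopen pattern, here is a minimal sketch using Python's standard `ftplib`. The function name `list_ftp_dir`, the `connect` parameter, the anonymous login, and the 60-second timeout are assumptions made for this example; they are not taken from bin/get_dbgap_data_dicts.py.

```python
from ftplib import FTP, all_errors


def list_ftp_dir(host, path, connect=FTP, timeout=60):
    """Open a fresh FTP connection, list one directory, then close it.

    `connect` is a hypothetical hook (defaulting to ftplib.FTP) so the
    connection class can be swapped out, e.g. in tests.
    """
    ftp = connect(host, timeout=timeout)
    try:
        ftp.login()        # anonymous login (assumed for this sketch)
        ftp.cwd(path)
        return ftp.nlst()  # names of the entries in `path`
    finally:
        # Always release the connection so it cannot sit idle and time
        # out while files are being downloaded over HTTP.
        try:
            ftp.quit()
        except all_errors:
            ftp.close()
```

Because every call opens its own short-lived connection, no FTP session has to survive the HTTP download step that runs between two listings.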
If a download fails, we now try to delete the local directory for that variable, as it will be either empty or incomplete. Re-running the script then downloads any variables that have not already been downloaded.
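The cleanup-on-failure behaviour could look roughly like the sketch below. The names `download_variable` and `download`, and the idea of treating an existing directory as "already downloaded", are assumptions for illustration only and may not match how the script decides what to skip.

```python
import shutil
from pathlib import Path


def download_variable(target_dir, download):
    """Download into `target_dir`, removing it again if the download fails.

    `download` is a hypothetical callable that writes files into the
    directory it is given and raises an exception on failure.
    Returns False if the directory already exists (assumed to mean the
    download completed on an earlier run), True after a fresh download.
    """
    target_dir = Path(target_dir)
    if target_dir.exists():
        return False  # skip: assumed complete from a previous run
    target_dir.mkdir(parents=True)
    try:
        download(target_dir)
    except Exception:
        # The directory is empty or incomplete, so remove it; the next
        # run will then see it as missing and retry the download.
        shutil.rmtree(target_dir, ignore_errors=True)
        raise
    return True
```

Removing the partial directory is what makes a re-run safe: only directories from successful downloads remain on disk, so anything missing is retried.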