IUSCA / bioloop

Scientific data management portal and pipeline application template

Rewrite bundle population script to not download archives from SDA #201

Closed ri-pandey closed 8 months ago

ri-pandey commented 8 months ago

In ticket #102 we added the functionality to persist bundle metadata in a separate table. In that ticket, we settled on an initial approach for populating bundle metadata, which downloads bundles from the SDA, computes their size, checksum, etc., and uses that info to persist the bundle metadata.

Since downloading all datasets from the SDA can take a long time, we should rewrite this script so that it can populate bundles without needing to download datasets.

populate_bundles.py was the original script introduced in ticket #102; it downloaded datasets from the SDA.

ri-pandey commented 8 months ago

This implementation should be considered outdated, since it was revised in ticket #205. To release the bundle download feature to a bioloop instance, follow the release steps documented in #205 instead of the ones below.


This will require the following steps to be performed before/after release:

Before Release

  1. Add the following properties to production.py:
    config['paths']['RAW_DATA']['bundle']
    config['paths']['DATA_PRODUCT']['bundle']

    The values for these properties are environment-specific. They should be derived as follows:

    config['paths']['RAW_DATA']['bundle']        ->   config['paths']['RAW_DATA']['stage'] / bundles
    config['paths']['DATA_PRODUCT']['bundle'] ->   config['paths']['DATA_PRODUCT']['stage'] / bundles

    As examples, these values would resolve to the following in the Bioloop dev env:

    config['paths']['RAW_DATA']['bundle']        ->   /N/scratch/scadev/bioloop/dev/staged/raw_data/bundles
    config['paths']['DATA_PRODUCT']['bundle'] ->   /N/scratch/scadev/bioloop/dev/staged/data_product/bundles

Note: For Bioloop instances with multiple worker nodes (like CPA), these properties will need to be configured on the node that runs the stage_dataset, validate_dataset, and setup_dataset_download steps.
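As a sketch, the new entries in production.py might look like the following. The surrounding config structure and the exact paths are assumptions based on the dev-env examples above; adjust them per environment:

```python
from pathlib import Path

# Hypothetical sketch of the additions to production.py.
# The stage root shown here is the Bioloop dev-env example from above.
stage_root = Path('/N/scratch/scadev/bioloop/dev/staged')

config = {
    'paths': {
        'RAW_DATA': {
            'stage': str(stage_root / 'raw_data'),
            # new property: bundle dir under the stage dir
            'bundle': str(stage_root / 'raw_data' / 'bundles'),
        },
        'DATA_PRODUCT': {
            'stage': str(stage_root / 'data_product'),
            # new property: bundle dir under the stage dir
            'bundle': str(stage_root / 'data_product' / 'bundles'),
        },
    },
}
```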

  2. Run a Prisma migration so that the new bundle table is created:

    npx prisma migrate deploy
  3. Deploy the application

After Release

  1. Run the populate_bundles script. Ensure that workers/ecosystem.config.js contains the config for the script:
...,
    {
      name: "populate_bundles",
      script: "python",
      args: "-u -m workers.scripts.populate_bundles",
      watch: false,
      interpreter: "",
      log_date_format: "YYYY-MM-DD HH:mm Z",
      error_file: "../logs/workers/populate_bundles.err",
      out_file: "../logs/workers/populate_bundles.log",
      autorestart: false,
    }
...

For Bioloop instances with multiple worker nodes (like CPA), this script will need to be run on the node that runs the stage_dataset, validate_dataset, and setup_dataset_download steps.

Restart pm2.

After pm2 is restarted, the script workers/workers/scripts/populate_bundles.py will kick off automatically. This script will populate the bundle table for each dataset, and unstage all datasets. These datasets will have to go through the stage_dataset > validate_dataset > setup_dataset_download workflow for the bundles to become available for download.

Steps (to restart the task in pm2):

pm2 ls
# note down id of the populate_bundles task
pm2 restart [id of populate_bundles task]
  2. Remove the entry for populate_bundles from workers/ecosystem.config.js

  3. In a future release, remove the now-redundant bundle_size column from the dataset table, along with its usages across the project. At the time of writing, Bioloop has already been updated to use the size from the bundle table instead of the dataset table, so any remaining usages of bundle_size simply need to be removed. The original PR for ticket #102 (#176) can help identify the places in the code where bundle_size should be removed.