gitter-lab / metl-pretrained

Pretrained METL models with minimal dependencies
MIT License

Add finetuned METL-Local and METL-Global target models #8

Open agitter opened 3 weeks ago

agitter commented 3 weeks ago

We will add these additional models to Zenodo

agitter commented 3 weeks ago

I wrote a short bash script to convert all the checkpoints for Zenodo using the code and environment at commit 45237bb of the METL repo.

#!/bin/bash
for f in finetuned_model_checkpoints/*/checkpoints/epoch*.ckpt; do
    python code/convert_ckpt.py --ckpt_path "$f" --output_dir models/
done

Script output

$ ./convert_checkpoints.sh
Processing checkpoint: finetuned_model_checkpoints/4Rh3WCbG/checkpoints/epoch=302-step=13635.ckpt
Saving converted checkpoint to: models/4Rh3WCbG.pt
Processing checkpoint: finetuned_model_checkpoints/4xbuC5y7/checkpoints/epoch=359-step=19080.ckpt
Saving converted checkpoint to: models/4xbuC5y7.pt
Processing checkpoint: finetuned_model_checkpoints/5SjoLx3y/checkpoints/epoch=275-step=65964.ckpt
Saving converted checkpoint to: models/5SjoLx3y.pt
Processing checkpoint: finetuned_model_checkpoints/64ncFxBR/checkpoints/epoch=415-step=32864.ckpt
Saving converted checkpoint to: models/64ncFxBR.pt
Processing checkpoint: finetuned_model_checkpoints/6JBzHpkQ/checkpoints/epoch=467-step=153504.ckpt
Saving converted checkpoint to: models/6JBzHpkQ.pt
Processing checkpoint: finetuned_model_checkpoints/9vSB3DRM/checkpoints/epoch=301-step=1024686.ckpt
Saving converted checkpoint to: models/9vSB3DRM.pt
Processing checkpoint: finetuned_model_checkpoints/BAWw23vW/checkpoints/epoch=283-step=159040.ckpt
Saving converted checkpoint to: models/BAWw23vW.pt
Processing checkpoint: finetuned_model_checkpoints/BuvxgE2x/checkpoints/epoch=340-step=18073.ckpt
Saving converted checkpoint to: models/BuvxgE2x.pt
Processing checkpoint: finetuned_model_checkpoints/ELL4GGQq/checkpoints/epoch=415-step=32864.ckpt
Saving converted checkpoint to: models/ELL4GGQq.pt
Processing checkpoint: finetuned_model_checkpoints/G9piq6WH/checkpoints/epoch=284-step=159600.ckpt
Saving converted checkpoint to: models/G9piq6WH.pt
Processing checkpoint: finetuned_model_checkpoints/HaUuRwfE/checkpoints/epoch=277-step=91184.ckpt
Saving converted checkpoint to: models/HaUuRwfE.pt
Processing checkpoint: finetuned_model_checkpoints/HenDpDWe/checkpoints/epoch=309-step=124310.ckpt
Saving converted checkpoint to: models/HenDpDWe.pt
Processing checkpoint: finetuned_model_checkpoints/K6BjsWXm/checkpoints/epoch=419-step=33180.ckpt
Saving converted checkpoint to: models/K6BjsWXm.pt
Processing checkpoint: finetuned_model_checkpoints/LWEY95Yb/checkpoints/epoch=327-step=107584.ckpt
Saving converted checkpoint to: models/LWEY95Yb.pt
Processing checkpoint: finetuned_model_checkpoints/NfbZL7jK/checkpoints/epoch=266-step=149520.ckpt
Saving converted checkpoint to: models/NfbZL7jK.pt
Processing checkpoint: finetuned_model_checkpoints/PeT2D92j/checkpoints/epoch=334-step=109880.ckpt
Saving converted checkpoint to: models/PeT2D92j.pt
Processing checkpoint: finetuned_model_checkpoints/Pgcseywk/checkpoints/epoch=276-step=939861.ckpt
Saving converted checkpoint to: models/Pgcseywk.pt
Processing checkpoint: finetuned_model_checkpoints/PncvgiJU/checkpoints/epoch=401-step=31758.ckpt
Saving converted checkpoint to: models/PncvgiJU.pt
Processing checkpoint: finetuned_model_checkpoints/PqBMjXkA/checkpoints/epoch=270-step=108671.ckpt
Saving converted checkpoint to: models/PqBMjXkA.pt
Processing checkpoint: finetuned_model_checkpoints/RBtqxzvu/checkpoints/epoch=303-step=13680.ckpt
Saving converted checkpoint to: models/RBtqxzvu.pt
Processing checkpoint: finetuned_model_checkpoints/RMFA6dnX/checkpoints/epoch=276-step=12465.ckpt
Saving converted checkpoint to: models/RMFA6dnX.pt
Processing checkpoint: finetuned_model_checkpoints/TdjCzoQQ/checkpoints/epoch=280-step=67159.ckpt
Saving converted checkpoint to: models/TdjCzoQQ.pt
Processing checkpoint: finetuned_model_checkpoints/UvMMdsq4/checkpoints/epoch=286-step=973791.ckpt
Saving converted checkpoint to: models/UvMMdsq4.pt
Processing checkpoint: finetuned_model_checkpoints/V3uTtXVe/checkpoints/epoch=282-step=12735.ckpt
Saving converted checkpoint to: models/V3uTtXVe.pt
Processing checkpoint: finetuned_model_checkpoints/VNpi9Zjt/checkpoints/epoch=277-step=111478.ckpt
Saving converted checkpoint to: models/VNpi9Zjt.pt
Processing checkpoint: finetuned_model_checkpoints/VwcRN6UB/checkpoints/epoch=278-step=59148.ckpt
Saving converted checkpoint to: models/VwcRN6UB.pt
Processing checkpoint: finetuned_model_checkpoints/YdzBYWHs/checkpoints/epoch=306-step=16271.ckpt
Saving converted checkpoint to: models/YdzBYWHs.pt
Processing checkpoint: finetuned_model_checkpoints/Z59BhUaE/checkpoints/epoch=279-step=59360.ckpt
Saving converted checkpoint to: models/Z59BhUaE.pt
Processing checkpoint: finetuned_model_checkpoints/cvnycE5Q/checkpoints/epoch=312-step=66356.ckpt
Saving converted checkpoint to: models/cvnycE5Q.pt
Processing checkpoint: finetuned_model_checkpoints/dAndZfJ4/checkpoints/epoch=283-step=963612.ckpt
Saving converted checkpoint to: models/dAndZfJ4.pt
Processing checkpoint: finetuned_model_checkpoints/dDoCCvfr/checkpoints/epoch=307-step=123508.ckpt
Saving converted checkpoint to: models/dDoCCvfr.pt
Processing checkpoint: finetuned_model_checkpoints/e9uhhnAv/checkpoints/epoch=264-step=148400.ckpt
Saving converted checkpoint to: models/e9uhhnAv.pt
Processing checkpoint: finetuned_model_checkpoints/ho54gxzv/checkpoints/epoch=303-step=72656.ckpt
Saving converted checkpoint to: models/ho54gxzv.pt
Processing checkpoint: finetuned_model_checkpoints/iu6ZahPw/checkpoints/epoch=294-step=15635.ckpt
Saving converted checkpoint to: models/iu6ZahPw.pt
Processing checkpoint: finetuned_model_checkpoints/jYesS9Ki/checkpoints/epoch=309-step=65720.ckpt
Saving converted checkpoint to: models/jYesS9Ki.pt
Processing checkpoint: finetuned_model_checkpoints/jhbL2FeB/checkpoints/epoch=285-step=68354.ckpt
Saving converted checkpoint to: models/jhbL2FeB.pt

Sam selected those checkpoints from the epoch with the lowest validation loss.
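The mapping from each checkpoint path to its converted file can be checked mechanically. A minimal sketch, assuming the directory layout shown in the script output above; `parse_checkpoint_path` and `CheckpointInfo` are hypothetical names, not part of the METL code:

```python
import re
from typing import NamedTuple


class CheckpointInfo(NamedTuple):
    uuid: str
    epoch: int
    step: int
    output_path: str


# Layout assumed from the log above:
# <root>/<uuid>/checkpoints/epoch=<E>-step=<S>.ckpt
_CKPT_RE = re.compile(r"([^/]+)/checkpoints/epoch=(\d+)-step=(\d+)\.ckpt$")


def parse_checkpoint_path(path: str, output_dir: str = "models") -> CheckpointInfo:
    """Extract the run UUID, epoch, and step from a checkpoint path."""
    match = _CKPT_RE.search(path)
    if match is None:
        raise ValueError(f"unexpected checkpoint path: {path}")
    uuid, epoch, step = match.groups()
    # The converted file is named after the UUID, as in the log above.
    return CheckpointInfo(uuid, int(epoch), int(step), f"{output_dir}/{uuid}.pt")


info = parse_checkpoint_path(
    "finetuned_model_checkpoints/4Rh3WCbG/checkpoints/epoch=302-step=13635.ckpt"
)
print(info.output_path)  # models/4Rh3WCbG.pt
```

Recording the epoch and step next to the UUID keeps a trace of which checkpoint each converted file came from.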

We'll use the attached index.csv to update the readme.

John-Peters-UW commented 2 weeks ago

mv ./models/PeT2D92j ./models/METL_G_20M_1D_avgfp
mv ./models/6JBzHpkQ ./models/METL_G_20M_3D_avgfp
mv ./models/HaUuRwfE ./models/METL_L_1D_avgfp
mv ./models/LWEY95Yb ./models/METL_L_3D_avgfp
mv ./models/4Rh3WCbG ./models/METL_G_20M_1D_dlg4-2022-abundance
mv ./models/RBtqxzvu ./models/METL_G_20M_3D_dlg4-2022-abundance
mv ./models/RMFA6dnX ./models/METL_L_1D_dlg4-2022-abundance
mv ./models/V3uTtXVe ./models/METL_L_3D_dlg4-2022-abundance
mv ./models/4xbuC5y7 ./models/METL_G_20M_1D_dlg4-2022-binding
mv ./models/BuvxgE2x ./models/METL_G_20M_3D_dlg4-2022-binding
mv ./models/YdzBYWHs ./models/METL_L_1D_dlg4-2022-binding
mv ./models/iu6ZahPw ./models/METL_L_3D_dlg4-2022-binding
mv ./models/dAndZfJ4 ./models/METL_G_20M_1D_gb1
mv ./models/9vSB3DRM ./models/METL_G_20M_3D_gb1
mv ./models/Pgcseywk ./models/METL_L_1D_gb1
mv ./models/UvMMdsq4 ./models/METL_L_3D_gb1
mv ./models/HenDpDWe ./models/METL_G_20M_1D_grb2-abundance
mv ./models/dDoCCvfr ./models/METL_G_20M_3D_grb2-abundance
mv ./models/VNpi9Zjt ./models/METL_L_1D_grb2-abundance
mv ./models/PqBMjXkA ./models/METL_L_3D_grb2-abundance
mv ./models/cvnycE5Q ./models/METL_G_20M_1D_grb2-binding
mv ./models/jYesS9Ki ./models/METL_G_20M_3D_grb2-binding
mv ./models/Z59BhUaE ./models/METL_L_1D_grb2-binding
mv ./models/VwcRN6UB ./models/METL_L_3D_grb2-binding
mv ./models/ho54gxzv ./models/METL_G_20M_1D_pab1
mv ./models/jhbL2FeB ./models/METL_G_20M_3D_pab1
mv ./models/TdjCzoQQ ./models/METL_L_1D_pab1
mv ./models/5SjoLx3y ./models/METL_L_3D_pab1
mv ./models/ELL4GGQq ./models/METL_G_20M_1D_tem-1
mv ./models/K6BjsWXm ./models/METL_G_20M_3D_tem-1
mv ./models/64ncFxBR ./models/METL_L_1D_tem-1
mv ./models/PncvgiJU ./models/METL_L_3D_tem-1
mv ./models/BAWw23vW ./models/METL_G_20M_1D_ube4b
mv ./models/G9piq6WH ./models/METL_G_20M_3D_ube4b
mv ./models/e9uhhnAv ./models/METL_L_1D_ube4b
mv ./models/NfbZL7jK ./models/METL_L_3D_ube4b

John-Peters-UW commented 2 weeks ago

No promises that it's correct. I don't know when to use the syntax here

vs here

But for most of the models it looks correct

Script:

import polars as pl

# index.csv provides the uuid, plot_name, and ds_name columns used below
index = pl.read_csv('./index.csv', separator=',')

for row in index.rows(named=True):
    uuid: str = row['uuid']
    plt_name: str = row['plot_name']
    ds_name: str = row['ds_name']
    # Abbreviate the model type
    new_name = plt_name.upper()
    new_name = new_name.replace('GLOBAL', 'G')
    new_name = new_name.replace('LOCAL', 'L')
    # Keep everything through the first 'D' that is followed by '_',
    # then append the dataset name
    tail_idx = new_name.index('D_')
    new_name = new_name[:tail_idx+1]
    new_name = new_name + f'_{ds_name}'
    print(f'mv ./models/{uuid} ./models/{new_name}')

John-Peters-UW commented 2 weeks ago

Actually, I'm not sure when to capitalize the ending for these either. It's inconsistent in the tables in metl-pretrained, so I just used whatever the ds_name is for the ones I printed. I think GB1 needs all caps but Pab1 doesn't, and it's confusing.

agitter commented 2 weeks ago

This looks great. We may need to modify the final line in the Python script to

print(f'mv ./models/{uuid}.pt ./models/{new_name}-{uuid}.pt')

Actually, I'm not sure when to capitalize the ending for these either.

That mostly comes from how the proteins and domains are referred to in literature, so it isn't entirely consistent.

samgelman commented 2 weeks ago

For model filenames, I suggest prefixing the name with FT to signify finetuned, followed by the ident of the base METL model, followed by the UUID. For instance: FT-METL-L-2M-1D-GFP-HaUuRwfE.pt and FT-METL-G-20M-3D-6JBzHpkQ.pt.
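A minimal sketch of that naming pattern. The helper name `finetuned_filename` is hypothetical, and the check that identifiers start with `METL-` is an assumption drawn from the two examples:

```python
def finetuned_filename(base_ident: str, uuid: str) -> str:
    """Build a filename of the form FT-<base model ident>-<UUID>.pt."""
    # Assumption: every base model identifier starts with "METL-".
    if not base_ident.startswith("METL-"):
        raise ValueError(f"unexpected base model identifier: {base_ident}")
    return f"FT-{base_ident}-{uuid}.pt"


print(finetuned_filename("METL-L-2M-1D-GFP", "HaUuRwfE"))
# FT-METL-L-2M-1D-GFP-HaUuRwfE.pt
```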

John-Peters-UW commented 2 weeks ago

mv ./models/PeT2D92j ./models/FT-METL-G-20M-1D-avGFP-PeT2D92j.pt
mv ./models/6JBzHpkQ ./models/FT-METL-G-20M-3D-avGFP-6JBzHpkQ.pt
mv ./models/HaUuRwfE ./models/FT-METL-L-1D-avGFP-HaUuRwfE.pt
mv ./models/LWEY95Yb ./models/FT-METL-L-3D-avGFP-LWEY95Yb.pt
mv ./models/4Rh3WCbG ./models/FT-METL-G-20M-1D-dlg4-2022-abundance-4Rh3WCbG.pt
mv ./models/RBtqxzvu ./models/FT-METL-G-20M-3D-dlg4-2022-abundance-RBtqxzvu.pt
mv ./models/RMFA6dnX ./models/FT-METL-L-1D-dlg4-2022-abundance-RMFA6dnX.pt
mv ./models/V3uTtXVe ./models/FT-METL-L-3D-dlg4-2022-abundance-V3uTtXVe.pt
mv ./models/4xbuC5y7 ./models/FT-METL-G-20M-1D-dlg4-2022-binding-4xbuC5y7.pt
mv ./models/BuvxgE2x ./models/FT-METL-G-20M-3D-dlg4-2022-binding-BuvxgE2x.pt
mv ./models/YdzBYWHs ./models/FT-METL-L-1D-dlg4-2022-binding-YdzBYWHs.pt
mv ./models/iu6ZahPw ./models/FT-METL-L-3D-dlg4-2022-binding-iu6ZahPw.pt
mv ./models/dAndZfJ4 ./models/FT-METL-G-20M-1D-GB1-dAndZfJ4.pt
mv ./models/9vSB3DRM ./models/FT-METL-G-20M-3D-GB1-9vSB3DRM.pt
mv ./models/Pgcseywk ./models/FT-METL-L-1D-GB1-Pgcseywk.pt
mv ./models/UvMMdsq4 ./models/FT-METL-L-3D-GB1-UvMMdsq4.pt
mv ./models/HenDpDWe ./models/FT-METL-G-20M-1D-Grb2-abundance-HenDpDWe.pt
mv ./models/dDoCCvfr ./models/FT-METL-G-20M-3D-Grb2-abundance-dDoCCvfr.pt
mv ./models/VNpi9Zjt ./models/FT-METL-L-1D-Grb2-abundance-VNpi9Zjt.pt
mv ./models/PqBMjXkA ./models/FT-METL-L-3D-Grb2-abundance-PqBMjXkA.pt
mv ./models/cvnycE5Q ./models/FT-METL-G-20M-1D-Grb2-binding-cvnycE5Q.pt
mv ./models/jYesS9Ki ./models/FT-METL-G-20M-3D-Grb2-binding-jYesS9Ki.pt
mv ./models/Z59BhUaE ./models/FT-METL-L-1D-Grb2-binding-Z59BhUaE.pt
mv ./models/VwcRN6UB ./models/FT-METL-L-3D-Grb2-binding-VwcRN6UB.pt
mv ./models/ho54gxzv ./models/FT-METL-G-20M-1D-pab1-ho54gxzv.pt
mv ./models/jhbL2FeB ./models/FT-METL-G-20M-3D-pab1-jhbL2FeB.pt
mv ./models/TdjCzoQQ ./models/FT-METL-L-1D-pab1-TdjCzoQQ.pt
mv ./models/5SjoLx3y ./models/FT-METL-L-3D-pab1-5SjoLx3y.pt
mv ./models/ELL4GGQq ./models/FT-METL-G-20M-1D-TEM-1-ELL4GGQq.pt
mv ./models/K6BjsWXm ./models/FT-METL-G-20M-3D-TEM-1-K6BjsWXm.pt
mv ./models/64ncFxBR ./models/FT-METL-L-1D-TEM-1-64ncFxBR.pt
mv ./models/PncvgiJU ./models/FT-METL-L-3D-TEM-1-PncvgiJU.pt
mv ./models/BAWw23vW ./models/FT-METL-G-20M-1D-UBE4B-BAWw23vW.pt
mv ./models/G9piq6WH ./models/FT-METL-G-20M-3D-UBE4B-G9piq6WH.pt
mv ./models/e9uhhnAv ./models/FT-METL-L-1D-UBE4B-e9uhhnAv.pt
mv ./models/NfbZL7jK ./models/FT-METL-L-3D-UBE4B-NfbZL7jK.pt

# `index` is the polars DataFrame read from index.csv in the earlier script
for row in index.rows(named=True):
    uuid: str = row['uuid']
    plt_name: str = row['plot_name']
    ds_name: str = row['ds_name']

    new_name = plt_name.upper()
    new_name = new_name.replace('GLOBAL', 'G')
    new_name = new_name.replace('LOCAL', 'L')
    tail_idx = new_name.index('D_')
    new_name = new_name[:tail_idx+1]
    new_name = new_name + f'_{ds_name}'
    new_name = new_name.replace('_', '-')
    # Adjust capitalization per protein; dlg4 and pab1 are left unchanged
    new_name = new_name.replace('gb1', 'GB1')
    new_name = new_name.replace('avgfp', 'avGFP')
    new_name = new_name.replace('grb2', 'Grb2')
    new_name = new_name.replace('tem', 'TEM')
    new_name = new_name.replace('ube4b', 'UBE4B')

    print(f'mv ./models/{uuid} ./models/FT-{new_name}-{uuid}.pt')

agitter commented 2 weeks ago

Thanks for working this out. The script above was missing the .pt from the original model names, so I used a string replace to update it.

#!/bin/bash
mv ./models/PeT2D92j.pt ./models/FT-METL-G-20M-1D-avGFP-PeT2D92j.pt
mv ./models/6JBzHpkQ.pt ./models/FT-METL-G-20M-3D-avGFP-6JBzHpkQ.pt
mv ./models/HaUuRwfE.pt ./models/FT-METL-L-1D-avGFP-HaUuRwfE.pt
mv ./models/LWEY95Yb.pt ./models/FT-METL-L-3D-avGFP-LWEY95Yb.pt
mv ./models/4Rh3WCbG.pt ./models/FT-METL-G-20M-1D-dlg4-2022-abundance-4Rh3WCbG.pt
mv ./models/RBtqxzvu.pt ./models/FT-METL-G-20M-3D-dlg4-2022-abundance-RBtqxzvu.pt
mv ./models/RMFA6dnX.pt ./models/FT-METL-L-1D-dlg4-2022-abundance-RMFA6dnX.pt
mv ./models/V3uTtXVe.pt ./models/FT-METL-L-3D-dlg4-2022-abundance-V3uTtXVe.pt
mv ./models/4xbuC5y7.pt ./models/FT-METL-G-20M-1D-dlg4-2022-binding-4xbuC5y7.pt
mv ./models/BuvxgE2x.pt ./models/FT-METL-G-20M-3D-dlg4-2022-binding-BuvxgE2x.pt
mv ./models/YdzBYWHs.pt ./models/FT-METL-L-1D-dlg4-2022-binding-YdzBYWHs.pt
mv ./models/iu6ZahPw.pt ./models/FT-METL-L-3D-dlg4-2022-binding-iu6ZahPw.pt
mv ./models/dAndZfJ4.pt ./models/FT-METL-G-20M-1D-GB1-dAndZfJ4.pt
mv ./models/9vSB3DRM.pt ./models/FT-METL-G-20M-3D-GB1-9vSB3DRM.pt
mv ./models/Pgcseywk.pt ./models/FT-METL-L-1D-GB1-Pgcseywk.pt
mv ./models/UvMMdsq4.pt ./models/FT-METL-L-3D-GB1-UvMMdsq4.pt
mv ./models/HenDpDWe.pt ./models/FT-METL-G-20M-1D-Grb2-abundance-HenDpDWe.pt
mv ./models/dDoCCvfr.pt ./models/FT-METL-G-20M-3D-Grb2-abundance-dDoCCvfr.pt
mv ./models/VNpi9Zjt.pt ./models/FT-METL-L-1D-Grb2-abundance-VNpi9Zjt.pt
mv ./models/PqBMjXkA.pt ./models/FT-METL-L-3D-Grb2-abundance-PqBMjXkA.pt
mv ./models/cvnycE5Q.pt ./models/FT-METL-G-20M-1D-Grb2-binding-cvnycE5Q.pt
mv ./models/jYesS9Ki.pt ./models/FT-METL-G-20M-3D-Grb2-binding-jYesS9Ki.pt
mv ./models/Z59BhUaE.pt ./models/FT-METL-L-1D-Grb2-binding-Z59BhUaE.pt
mv ./models/VwcRN6UB.pt ./models/FT-METL-L-3D-Grb2-binding-VwcRN6UB.pt
mv ./models/ho54gxzv.pt ./models/FT-METL-G-20M-1D-pab1-ho54gxzv.pt
mv ./models/jhbL2FeB.pt ./models/FT-METL-G-20M-3D-pab1-jhbL2FeB.pt
mv ./models/TdjCzoQQ.pt ./models/FT-METL-L-1D-pab1-TdjCzoQQ.pt
mv ./models/5SjoLx3y.pt ./models/FT-METL-L-3D-pab1-5SjoLx3y.pt
mv ./models/ELL4GGQq.pt ./models/FT-METL-G-20M-1D-TEM-1-ELL4GGQq.pt
mv ./models/K6BjsWXm.pt ./models/FT-METL-G-20M-3D-TEM-1-K6BjsWXm.pt
mv ./models/64ncFxBR.pt ./models/FT-METL-L-1D-TEM-1-64ncFxBR.pt
mv ./models/PncvgiJU.pt ./models/FT-METL-L-3D-TEM-1-PncvgiJU.pt
mv ./models/BAWw23vW.pt ./models/FT-METL-G-20M-1D-UBE4B-BAWw23vW.pt
mv ./models/G9piq6WH.pt ./models/FT-METL-G-20M-3D-UBE4B-G9piq6WH.pt
mv ./models/e9uhhnAv.pt ./models/FT-METL-L-1D-UBE4B-e9uhhnAv.pt
mv ./models/NfbZL7jK.pt ./models/FT-METL-L-3D-UBE4B-NfbZL7jK.pt
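The same renames could also be applied from Python, with checks that run before any file is touched. This is only a sketch under assumptions: `rename_models` is a hypothetical helper, and the demo uses a temporary directory in place of the real `models/` folder.

```python
import tempfile
from pathlib import Path


def rename_models(model_dir: Path, renames: dict[str, str],
                  dry_run: bool = True) -> list[tuple[Path, Path]]:
    """Rename files in model_dir according to renames (old name -> new name)."""
    if len(set(renames.values())) != len(renames):
        raise ValueError("two source files map to the same target name")
    planned = []
    for old_name, new_name in renames.items():
        src = model_dir / old_name
        dst = model_dir / new_name
        if not src.is_file():
            raise FileNotFoundError(f"missing source file: {src}")
        if dst.exists():
            raise FileExistsError(f"target already exists: {dst}")
        planned.append((src, dst))
    # Only touch the filesystem after every pair has passed the checks.
    if not dry_run:
        for src, dst in planned:
            src.rename(dst)
    return planned


with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp)
    (model_dir / "PeT2D92j.pt").touch()
    rename_models(
        model_dir,
        {"PeT2D92j.pt": "FT-METL-G-20M-1D-avGFP-PeT2D92j.pt"},
        dry_run=False,
    )
    print(sorted(p.name for p in model_dir.iterdir()))
    # ['FT-METL-G-20M-1D-avGFP-PeT2D92j.pt']
```

With `dry_run=True` (the default) the function only returns the planned pairs, so a missing `.pt` extension on the source names would raise `FileNotFoundError` before anything is moved.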

Once I started uploading models, I noticed some of the names didn't match (avGFP vs. GFP). I used the METL-Local source model identifiers as the reference dataset names:

METL-L-2M-1D-GFP
METL-L-2M-1D-DLG4_2022
METL-L-2M-1D-GB1
METL-L-2M-1D-GRB2
METL-L-2M-1D-Pab1
METL-L-2M-1D-TEM-1
METL-L-2M-1D-Ube4b

I adjusted those with a second script.

#!/bin/bash
mv ./models/FT-METL-G-20M-1D-avGFP-PeT2D92j.pt ./models/FT-METL-G-20M-1D-GFP-PeT2D92j.pt
mv ./models/FT-METL-G-20M-3D-avGFP-6JBzHpkQ.pt ./models/FT-METL-G-20M-3D-GFP-6JBzHpkQ.pt
mv ./models/FT-METL-L-1D-avGFP-HaUuRwfE.pt ./models/FT-METL-L-1D-GFP-HaUuRwfE.pt
mv ./models/FT-METL-L-3D-avGFP-LWEY95Yb.pt ./models/FT-METL-L-3D-GFP-LWEY95Yb.pt
mv ./models/FT-METL-G-20M-1D-dlg4-2022-abundance-4Rh3WCbG.pt ./models/FT-METL-G-20M-1D-DLG4_2022-ABUNDANCE-4Rh3WCbG.pt
mv ./models/FT-METL-G-20M-3D-dlg4-2022-abundance-RBtqxzvu.pt ./models/FT-METL-G-20M-3D-DLG4_2022-ABUNDANCE-RBtqxzvu.pt
mv ./models/FT-METL-L-1D-dlg4-2022-abundance-RMFA6dnX.pt ./models/FT-METL-L-1D-DLG4_2022-ABUNDANCE-RMFA6dnX.pt
mv ./models/FT-METL-L-3D-dlg4-2022-abundance-V3uTtXVe.pt ./models/FT-METL-L-3D-DLG4_2022-ABUNDANCE-V3uTtXVe.pt
mv ./models/FT-METL-G-20M-1D-dlg4-2022-binding-4xbuC5y7.pt ./models/FT-METL-G-20M-1D-DLG4_2022-BINDING-4xbuC5y7.pt
mv ./models/FT-METL-G-20M-3D-dlg4-2022-binding-BuvxgE2x.pt ./models/FT-METL-G-20M-3D-DLG4_2022-BINDING-BuvxgE2x.pt
mv ./models/FT-METL-L-1D-dlg4-2022-binding-YdzBYWHs.pt ./models/FT-METL-L-1D-DLG4_2022-BINDING-YdzBYWHs.pt
mv ./models/FT-METL-L-3D-dlg4-2022-binding-iu6ZahPw.pt ./models/FT-METL-L-3D-DLG4_2022-BINDING-iu6ZahPw.pt
mv ./models/FT-METL-G-20M-1D-Grb2-abundance-HenDpDWe.pt ./models/FT-METL-G-20M-1D-GRB2-ABUNDANCE-HenDpDWe.pt
mv ./models/FT-METL-G-20M-3D-Grb2-abundance-dDoCCvfr.pt ./models/FT-METL-G-20M-3D-GRB2-ABUNDANCE-dDoCCvfr.pt
mv ./models/FT-METL-L-1D-Grb2-abundance-VNpi9Zjt.pt ./models/FT-METL-L-1D-GRB2-ABUNDANCE-VNpi9Zjt.pt
mv ./models/FT-METL-L-3D-Grb2-abundance-PqBMjXkA.pt ./models/FT-METL-L-3D-GRB2-ABUNDANCE-PqBMjXkA.pt
mv ./models/FT-METL-G-20M-1D-Grb2-binding-cvnycE5Q.pt ./models/FT-METL-G-20M-1D-GRB2-BINDING-cvnycE5Q.pt
mv ./models/FT-METL-G-20M-3D-Grb2-binding-jYesS9Ki.pt ./models/FT-METL-G-20M-3D-GRB2-BINDING-jYesS9Ki.pt
mv ./models/FT-METL-L-1D-Grb2-binding-Z59BhUaE.pt ./models/FT-METL-L-1D-GRB2-BINDING-Z59BhUaE.pt
mv ./models/FT-METL-L-3D-Grb2-binding-VwcRN6UB.pt ./models/FT-METL-L-3D-GRB2-BINDING-VwcRN6UB.pt
mv ./models/FT-METL-G-20M-1D-pab1-ho54gxzv.pt ./models/FT-METL-G-20M-1D-Pab1-ho54gxzv.pt
mv ./models/FT-METL-G-20M-3D-pab1-jhbL2FeB.pt ./models/FT-METL-G-20M-3D-Pab1-jhbL2FeB.pt
mv ./models/FT-METL-L-1D-pab1-TdjCzoQQ.pt ./models/FT-METL-L-1D-Pab1-TdjCzoQQ.pt
mv ./models/FT-METL-L-3D-pab1-5SjoLx3y.pt ./models/FT-METL-L-3D-Pab1-5SjoLx3y.pt
mv ./models/FT-METL-G-20M-1D-UBE4B-BAWw23vW.pt ./models/FT-METL-G-20M-1D-Ube4b-BAWw23vW.pt
mv ./models/FT-METL-G-20M-3D-UBE4B-G9piq6WH.pt ./models/FT-METL-G-20M-3D-Ube4b-G9piq6WH.pt
mv ./models/FT-METL-L-1D-UBE4B-e9uhhnAv.pt ./models/FT-METL-L-1D-Ube4b-e9uhhnAv.pt
mv ./models/FT-METL-L-3D-UBE4B-NfbZL7jK.pt ./models/FT-METL-L-3D-Ube4b-NfbZL7jK.pt
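The substitutions in this second script can be summarized as a small lookup. A sketch, with the replacement pairs read off the commands above; `normalize_dataset_name` is a hypothetical name, and the hyphen-delimited matching is an assumption that keeps UUIDs from being altered:

```python
# Each pair is (token in the first-round name, token in the final name).
# The surrounding hyphens restrict matches to whole name components.
_REPLACEMENTS = [
    ("-avGFP-", "-GFP-"),
    ("-dlg4-2022-", "-DLG4_2022-"),
    ("-Grb2-", "-GRB2-"),
    ("-pab1-", "-Pab1-"),
    ("-UBE4B-", "-Ube4b-"),
    ("-abundance-", "-ABUNDANCE-"),
    ("-binding-", "-BINDING-"),
]


def normalize_dataset_name(filename: str) -> str:
    """Apply the dataset-name substitutions to a first-round filename."""
    for old, new in _REPLACEMENTS:
        filename = filename.replace(old, new)
    return filename


print(normalize_dataset_name("FT-METL-G-20M-1D-dlg4-2022-abundance-4Rh3WCbG.pt"))
# FT-METL-G-20M-1D-DLG4_2022-ABUNDANCE-4Rh3WCbG.pt
```

GB1 and TEM-1 names pass through unchanged, which matches their absence from the script above.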

Zenodo limits each record to 100 files; beyond that, depositors must bundle individual files into archives. We're only at 58 files, but that is worth considering if we add many more DMS datasets and target models.

Here is a preview of the Zenodo dataset with the new files. If this looks good, I'll release it.

samgelman commented 2 weeks ago

Looks good to me. You can go ahead and release it!

agitter commented 2 weeks ago

I published version 2.0: https://doi.org/10.5281/zenodo.13377502

I'm leaving this open until we update the readme describing all the new models and the models in main.py. All of the existing Zenodo URLs from version 1.0 like https://zenodo.org/records/11051645/files/METL-G-20M-1D-D72M9aEp.pt?download=1 will still work, but we can also replace the version identifier in the URL to get https://zenodo.org/records/13377502/files/METL-G-20M-1D-D72M9aEp.pt?download=1 if we want. I copied all version 1.0 files to version 2.0 as well.
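Switching a download URL between the two versions only changes the record number. A sketch of that substitution, assuming the URL shape shown above; `switch_record` is a hypothetical helper name:

```python
import re


def switch_record(url: str, new_record_id: int) -> str:
    """Replace the record number in a zenodo.org/records/<id>/files/... URL."""
    new_url, count = re.subn(
        r"(zenodo\.org/records/)\d+(/)",
        rf"\g<1>{new_record_id}\g<2>",
        url,
        count=1,
    )
    if count == 0:
        raise ValueError(f"not a Zenodo record URL: {url}")
    return new_url


v1_url = "https://zenodo.org/records/11051645/files/METL-G-20M-1D-D72M9aEp.pt?download=1"
print(switch_record(v1_url, 13377502))
# https://zenodo.org/records/13377502/files/METL-G-20M-1D-D72M9aEp.pt?download=1
```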

Our naming convention isn't very clear about distinguishing the METL-Local GFP models that were trained on most of the data versus the low-N models:

FT-METL-L-2M-3D-GFP-PEkeRuxb.pt
FT-METL-L-3D-GFP-LWEY95Yb.pt

so let's be sure that is clear in the readme.
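To make the ambiguity concrete, the two filenames can be split into components. This is a sketch only: the regular expression and `parse_finetuned_name` are hypothetical, inferred from the filenames in this thread, and they expect a dataset component to be present.

```python
import re
from typing import Optional

# Assumed pattern: FT-METL-<G|L>-[<size>M-]<1D|3D>-<dataset>-<8-char UUID>.pt
_NAME_RE = re.compile(
    r"^FT-METL-(?P<kind>[GL])-(?:(?P<size>\d+M)-)?(?P<dim>[13]D)-"
    r"(?P<dataset>.+)-(?P<uuid>[A-Za-z0-9]{8})\.pt$"
)


def parse_finetuned_name(filename: str) -> dict[str, Optional[str]]:
    """Split a finetuned model filename into its named components."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognized model filename: {filename}")
    return match.groupdict()


a = parse_finetuned_name("FT-METL-L-2M-3D-GFP-PEkeRuxb.pt")
b = parse_finetuned_name("FT-METL-L-3D-GFP-LWEY95Yb.pt")
print(sorted(key for key in a if a[key] != b[key]))  # ['size', 'uuid']
```

Apart from the UUID, the only parsed difference is the presence of the `2M` size token, which by itself says nothing about how much training data each model saw.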

The final step will be updating the model list in the Colab notebook.