luoyunan / ECNet

An evolutionary context-integrated deep learning framework for protein engineering
BSD 3-Clause "New" or "Revised" License

Best way to make new predictions? #5

Open jrhorne opened 2 years ago

jrhorne commented 2 years ago

Hello, what's the best way to generate predictions on new mutations? I have a data frame of double mutations to test but was not sure about the best way to query a trained ECNet model.

I tried test_results = ecnet.test(test_df=double_df, save_prediction=True), but this resulted in an error regarding a test_loader being non-iterable (as it is set to None).

Any thoughts would be greatly appreciated! Thanks.

luoyunan commented 2 years ago

I pushed a commit. You can now use ECNet to generate predictions on a new dataset.

The following example trains ECNet on single-mutant fitness data for RRM (passed via --train) and tests it on a double-mutant dataset (passed via --test).

CUDA_VISIBLE_DEVICES=0 python scripts/run_example.py \
    --train data/RRM_single.tsv \
    --test data/RRM_double.tsv \
    --fasta data/RRM.fasta \
    --local_feature data/RRM.braw \
    --output_dir ./output/RRM \
    --save_checkpoint \
    --n_ensembles 2 \
    --epochs 100

You can also reuse the trained model (instead of training from scratch) by passing the --saved_model_dir argument:

CUDA_VISIBLE_DEVICES=0 python scripts/run_example.py \
    --test data/RRM_double.tsv \
    --fasta data/RRM.fasta \
    --local_feature data/RRM.braw \
    --n_ensembles 2 \
    --output_dir ./output/RRM \
    --saved_model_dir ./output/RRM
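
For anyone driving this from a Python session (as in the original question, where the mutations live in a data frame), here is a minimal sketch that assembles the same prediction-only command. The helper name build_predict_command is hypothetical and not part of ECNet; the flags are the ones shown in the command above. It assumes the data frame has first been written to a TSV with the same columns as data/RRM_double.tsv, for example with double_df.to_csv(path, sep="\t", index=False), and that the script is run from the repository root.

```python
import os
import shlex


def build_predict_command(test_tsv, fasta, local_feature, model_dir, n_ensembles=2):
    """Hypothetical helper: build the prediction-only call to scripts/run_example.py.

    Only flags that appear in the thread are used. No --train flag is passed,
    so the checkpoints in model_dir are reused instead of training from scratch.
    """
    return [
        "python", "scripts/run_example.py",
        "--test", test_tsv,
        "--fasta", fasta,
        "--local_feature", local_feature,
        "--n_ensembles", str(n_ensembles),
        "--output_dir", model_dir,
        "--saved_model_dir", model_dir,
    ]


cmd = build_predict_command(
    test_tsv="data/RRM_double.tsv",
    fasta="data/RRM.fasta",
    local_feature="data/RRM.braw",
    model_dir="./output/RRM",
)

# Pin the job to GPU 0, mirroring the CUDA_VISIBLE_DEVICES=0 prefix above.
env = dict(os.environ, CUDA_VISIBLE_DEVICES="0")

# Show the command as a single shell-quoted string for inspection.
print(shlex.join(cmd))

# To actually launch it from the repository root:
# import subprocess
# subprocess.run(cmd, env=env, check=True)
```

The example in this thread uses the same --n_ensembles value (2) for training and for prediction, so the sketch defaults to that value.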