debbiemarkslab / plmc

Inference of couplings in proteins and RNAs from sequence variation
MIT License

Focus sequence non-functional if alphabet provided #9

Open njrollins opened 4 years ago

njrollins commented 4 years ago

When an alphabet is specified with -a, all positions of the focus sequence are modeled, regardless of whether they are uppercase or lowercase.

Without an alphabet (plmc assumes the protein alphabet, which is a problem for RNAs):

plmc -c my.model -f target_seq -t 0.10  test.a2m
Found focus target_seq as sequence 1
0 valid sequences out of 1 
433 sites out of 450

With an RNA alphabet (all positions are modeled, even lowercase ones):

plmc -c my.model -f target_seq -t 0.10  -a -AUGC test.a2m
Found focus target_seq as sequence 1
1 valid sequences out of 1 
450 sites out of 450

joshuaroll commented 4 years ago

This update should exclude lowercase positions of the focus sequence from being modeled when a custom alphabet is provided: https://github.com/debbiemarkslab/plmc/pull/10#issue-469703712
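
For reference, the expected behavior can be sketched in a few lines of Python. This is only an illustration of the rule described in this thread, not plmc's actual C implementation: the function names `focus_columns` and `encode` are hypothetical, and it assumes that gap characters in the focus sequence are excluded along with lowercase letters, and that characters outside the alphabet map to the gap index. The point is that column selection depends only on the case of the focus sequence, never on which alphabet was passed.

```python
def focus_columns(focus_seq):
    """Return the indices of columns to model (hypothetical helper).

    Only uppercase letters in the focus sequence are kept. Lowercase
    letters and gap characters are dropped, since str.isupper() is
    False for both. Note that the alphabet plays no role here.
    """
    return [i for i, ch in enumerate(focus_seq) if ch.isupper()]


def encode(seq, columns, alphabet="-AUGC"):
    """Map the selected columns of seq to integer codes (hypothetical helper).

    The first alphabet character is treated as the gap, as in the
    "-AUGC" string used in this thread. Characters not found in the
    alphabet are assigned the gap code 0 (an assumption of this sketch).
    """
    return [max(alphabet.find(seq[i].upper()), 0) for i in columns]


focus = "auGCUAgc-AUGc"
cols = focus_columns(focus)
print(f"{len(cols)} sites out of {len(focus)}")
print(encode(focus, cols))
```

With this rule, the number of sites reported for a focus sequence is the same with or without a custom alphabet, which is the behavior the issue asks for.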