debbiemarkslab / EVcouplings

Evolutionary couplings from protein and RNA sequence alignments
http://evcouplings.org
Other
240 stars 76 forks source link

fix: match plmc log style #283

Closed TTTPOB closed 1 year ago

TTTPOB commented 1 year ago

on my machine, plmc log looks like

Found focus O88935_B2RQR8 as sequence 1
6 valid sequences out of 6 
1468 sites out of 1469
Region starts at 1
Effective number of samples (to 1 decimal place): 1.0   (60% identical neighborhood = 1.000 samples)

which does not match the extraction part of parse_plmc_log in https://github.com/debbiemarkslab/EVcouplings/blob/2590f9ee54cbd59bd7ff0bd59875d5a5aac0860b/evcouplings/couplings/tools.py#L55

It will cause EVcouplings fails to run subsequent steps.

I made a minor adjustment on the regex to match it.

thomashopf commented 1 year ago

Thanks for catching this - looks like the output format changed slightly in the latest plmc update

@aaronkollasch Could you share more details about this - Is there a strong reason to change the plmc output?

aaronkollasch commented 1 year ago

Thanks for the catch @TTTPOB

I just looked back at the relevant plmc PR and it seems that I missed this change to the plmc output. I'm fine with reverting that change on the plmc side.

It does suggest that we should be producing a more standardized output from plmc instead of relying on parsing log messages, but that would also add complexity to plmc so I'm not sure it's worth it.

TTTPOB commented 1 year ago

Thank you for checking that. I am totally with the structured output idea, but revert change is ok to make it just run. Should I close this pr now? @aaronkollasch

thomashopf commented 1 year ago

I just looked back at the relevant plmc PR and it seems that I missed this change to the plmc output. I'm fine with reverting that change on the plmc side.

👍

It does suggest that we should be producing a more standardized output from plmc instead of relying on parsing log messages, but that would also add complexity to plmc so I'm not sure it's worth it.

Agreed. If going for standardized output at any point, my suggestion would be to add another command line parameter that allows to save the meta output e.g. to a JSON file if requested.

aaronkollasch commented 1 year ago

I just merged the change, so this should be fixed once you pull and rebuild plmc. Let me know if not, and thanks again for the PR!