atfrank / Metallo

A Machine Learning Tool for Classifying Magnesium Binding Sites in RNA

Inconsistency between the features generated by the current Metallo and the ones in train/data/fingerprint #2

Open zlcrrrr opened 3 years ago

zlcrrrr commented 3 years ago

Hi Jingru,

I wonder if the classifier was trained using the features in train/data/fingerprint?

I found that the features generated by the current Metallo are inconsistent with the features in train/data/fingerprint. I am using the script in the "sh" folder. For example, for 1d4r the current Metallo generates 535 dummy atoms, whereas train/data/fingerprint/1d4r.txt contains only 25.
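One quick way to quantify a mismatch like this is to count the records in both files. The following is a minimal sketch, assuming that each non-blank line not starting with `#` in a fingerprint file corresponds to one dummy atom. That file layout is an assumption, and `count_records` and `compare_counts` are hypothetical helper names, not part of Metallo.

```python
def count_records(path):
    """Count non-blank lines that do not start with '#'.

    Assumption: one dummy atom per such line. Adjust this if the
    fingerprint files use a header row or a different layout.
    """
    count = 0
    with open(path) as handle:
        for line in handle:
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                count += 1
    return count


def compare_counts(generated, reference):
    """Return (generated_count, reference_count, difference)."""
    n_gen = count_records(generated)
    n_ref = count_records(reference)
    return n_gen, n_ref, n_gen - n_ref
```

For instance, `compare_counts("out/1d4r.txt", "train/data/fingerprint/1d4r.txt")` would report both counts side by side, where `out/1d4r.txt` stands in for wherever the freshly generated file is written.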

I suspect the features in train/data/fingerprint were generated by an earlier version of Metallo, which I committed on Feb 7, 2020: https://github.com/atfrank/AtomicFeaturizer/tree/5e026fa22b26451b5f5fca1a9fde35e369e2ad6f. In that version, Metallo still retains crystal waters and uses them as a constraint when generating dummy atoms. (I think the current version also retains crystal waters, but the upper bound on the water distance is relaxed to 10000 Å.) The relevant line is line 315 of metallo.cpp, in both the current and the earlier version.
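To illustrate why relaxing that bound could inflate the dummy-atom count, here is a minimal sketch of a distance-based filter. It is not the code in metallo.cpp: the function name, the candidate points, and the 3.0 Å cutoff are all hypothetical. It keeps a candidate only if some water lies within `max_water_dist`, so with a bound of 10000 Å essentially every candidate passes.

```python
import math


def filter_by_water(candidates, waters, max_water_dist):
    """Keep candidates whose nearest water is within max_water_dist (in Å).

    Illustrative only; the real constraint in metallo.cpp may differ.
    """
    kept = []
    for point in candidates:
        # Distance to the closest water; infinity if there are no waters.
        nearest = min((math.dist(point, w) for w in waters), default=math.inf)
        if nearest <= max_water_dist:
            kept.append(point)
    return kept


# Hypothetical coordinates, chosen only to show the effect of the bound.
candidates = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
waters = [(1.0, 0.0, 0.0)]

tight = filter_by_water(candidates, waters, 3.0)
relaxed = filter_by_water(candidates, waters, 10000.0)
print(len(tight), len(relaxed))
```

With a tight bound only the candidate near the water survives, while the relaxed bound keeps all of them, which is the kind of jump seen between 25 and 535 dummy atoms.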

Thanks, Lichirui

atfrank commented 3 years ago

Either way, for the manuscript, Metallo should only be using the RNA coordinates to generate dummy atoms. For the use case we have in mind, the user only uploads RNA coordinates.
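As a minimal sketch of what restricting the input to RNA coordinates could look like as a preprocessing step, the snippet below filters PDB lines down to ATOM records for the four standard RNA residues. It relies on the fixed-column PDB format, where the residue name occupies columns 18-20. The helper name `keep_rna_atoms` is hypothetical, and Metallo itself may handle this differently.

```python
RNA_RESIDUES = {"A", "C", "G", "U"}


def keep_rna_atoms(pdb_lines):
    """Return only ATOM records whose residue name is a standard RNA residue.

    HETATM records (including crystal water, HOH) are dropped, and so are
    modified nucleotides stored as HETATM; extend RNA_RESIDUES or the
    record check if those should be kept.
    """
    kept = []
    for line in pdb_lines:
        if line.startswith("ATOM"):
            resname = line[17:20].strip()  # PDB columns 18-20
            if resname in RNA_RESIDUES:
                kept.append(line)
    return kept
```

Running the featurizer on the filtered file would ensure that waters and ions in the deposited structure cannot influence where dummy atoms are placed.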

(In reply to Lichirui Zhang's message of Fri, Sep 17, 2021, 2:51 PM.)

Aaron T. Frank