Closed IchiruTake closed 2 years ago
@pstjohn I have completed the first step preparing module utils.py
as a replacement for fragment.py
. Can you take a look at it ? If everything going well, should I push all calling methods from different files such as prediction.py
and model.py
using the module utils.py
?
@pstjohn Can you review the code for me? What I did is to:
utils.py
as a replacement of fragment.py
to improve the pipeline speed. The fragment.py
is left there for legacy purpose.arg::MODEL_CONFIG
in __init__.py
@pstjohn I have corrected the failed test. Can you review it again ?
Just so I'm following, is the computational speed improvement here mainly de-duplicating inside the fragment code, rather than after all the fragments have been generated? That might be more simply implemented as a line or two here: https://github.com/NREL/alfabet/blob/master/alfabet/fragment.py#L11-L12
A couple style notes: It's standard to use
snake_case
for variable and function names, andTitleCase
for class definitions. Internal class functions usually just start with a preceding underscore (and don't have a trailing one) i.e.,_my_internal_function
.
I will correct the function as your style. Yes you are right, but it does not stop there, the optimization of de-duplicating is important but what I want is to centralize all molecules processing with a centralized class to avoid any memory leak, possible parallelization and reduce duplicated operation. For example, the conversion of molecules from SMILES and AddHs function which return new molecule. Even if they are written in C++, the time and temporary memory is non-trivial with less yime fluctuation. The web-based is the best example to get this advantage, deduplicate fragments benefits large dataset processing
Moreover, I tried to reduce the namespace of ALFABET by taking only necessary functions which is definitely faster with narrower import. I would fix it in within this week as in here, it is the midnight.
Furthermore, we want to gain more control on the molecule processing, which you can see in the method MolProcessor::GetReaction()
and reduce future refactoring.
All in all, thank you for your review.
Are you able to run the test suite locally? Looks like some of those tests are still failing
Are you able to run the test suite locally? Looks like some of those tests are still failing Let me check that if the installation behaves well
Are you able to run the test suite locally? Looks like some of those tests are still failing
The installation worked fine and call test_predict()
does not raise assertion. I think it would be better to cast the current state of utils.py
to fragment.py
for better maintenance. I will try to make code more consistently to your coding style. If there are anything else, you can response to me immediately through Linkedin for faster response.
Speed the model start-up by setting arg::compile=False
on empty configuration model.
Linkedin: https://www.linkedin.com/in/minh-pham-hoang-0b0626172/
This PR attempted to refactor the source code by:
The tests worked well. Do you have any comments? @pstjohn
@pstjohn Please review for me when I am processed and afterward. Do you mind if the function convert to the class object for better the state of the molecule is maintained.
Related: #2