JAQPOT Quattro is the fourth version of YAQP, a RESTful web service that can be used to train machine learning models and use them to obtain toxicological predictions for given chemical compounds or engineered nanomaterials. The project is written in Java 8 and Java EE 7.
Create descriptor calculation services based on the following principles:
Descriptor calculation services rely on the JPDI architecture.
Some descriptors can be parametrized, i.e., they are functions of both the substance structure and (optionally) some external parameters.
They implement the algorithm/model schema: a POST to /algorithm/{id} invokes a descriptor calculation algorithm and creates a descriptor calculation model; a POST to /model/{id} then creates a dataset with the requested descriptors.
When creating a descriptor calculation model, the features for this model are also created (by Jaqpot). Let me give an example: say we create a model using the algorithm /algorithm/cdk with the parameter topological=true. This creates a model that calculates only topological descriptors. The procedure is as follows:
1. The client POSTs the parameters topological=true to /algorithm/cdk.
2. Jaqpot forwards this request to the respective JPDI service.
3. The JPDI service returns a response in which it specifies that it is going to compute the following descriptors: wiener, hosoya, estrada, randic, etc.
4. Jaqpot searches the database for features that correspond to these descriptors, with queries like "Find all descriptors which have source algorithm=/algorithm/cdk and title=wiener".
5. The model is generated and acquires an ID, e.g., /model/abc3d.
Note that when creating a descriptor calculation model, we don't need to send the dataset to the JPDI service. The dataset is sent to the JPDI service later, when we actually need to calculate the descriptors.
To calculate descriptor values, the client has to POST a dataset with substances (structures, e.g., SDF, SMILES, etc.) to a descriptor calculation model.
Descriptor calculation algorithms and models have to be properly annotated.
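The feature lookup in the model-creation procedure ("Find all descriptors which have source algorithm=/algorithm/cdk and title=wiener") could be sketched roughly as below. This is only an in-memory illustration: `StoredFeature`, `FeatureLookup`, the `feature/N` ids, and the list standing in for the database are all hypothetical, not Jaqpot's actual entities or queries.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical, simplified stand-in for a stored feature (not Jaqpot's real entity).
final class StoredFeature {
    final String id;
    final String title;
    final String sourceAlgorithm;

    StoredFeature(String id, String title, String sourceAlgorithm) {
        this.id = id;
        this.title = title;
        this.sourceAlgorithm = sourceAlgorithm;
    }
}

final class FeatureLookup {
    // Returns the features whose source is the given algorithm and whose title matches.
    static List<StoredFeature> findBySourceAndTitle(
            List<StoredFeature> store, String algorithm, String title) {
        return store.stream()
                .filter(f -> f.sourceAlgorithm.equals(algorithm) && f.title.equals(title))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The list plays the role of the database in this sketch.
        List<StoredFeature> store = Arrays.asList(
                new StoredFeature("feature/1", "wiener", "/algorithm/cdk"),
                new StoredFeature("feature/2", "hosoya", "/algorithm/cdk"),
                new StoredFeature("feature/3", "wiener", "/algorithm/other"));

        for (StoredFeature f : findBySourceAndTitle(store, "/algorithm/cdk", "wiener")) {
            System.out.println(f.id + " -> " + f.title);
        }
    }
}
```

Matching on both the source algorithm and the title keeps a "wiener" feature produced by one algorithm from being confused with a feature of the same title produced by another.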
What we need to do as a first step is to specify the form of input-output entities (DTOs).
As I understand it, what we need for starters is an additional field in the JPDI TrainingResponse, a List predictedFeatures, that an algorithm service can use to specify the titles of the features it intends to create. Jaqpot should then check whether features with the specified titles exist for that algorithm; if not, it should create them as proper features. Either way, it stores their ids as predictedFeatures inside the newly created model.
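To make the proposal concrete, here is a rough sketch of how that could look. Everything in it is an assumption for illustration: the element type `List<String>`, the classes `TrainingResponse`, `FeatureRegistry` and `ModelFactory`, the `feature/N` id format, and the in-memory map that stands in for the database. The real JPDI DTOs have more fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical shape of the extended JPDI training response (other fields omitted).
final class TrainingResponse {
    // Titles of the features the algorithm service intends to create.
    final List<String> predictedFeatures;

    TrainingResponse(List<String> predictedFeatures) {
        this.predictedFeatures = predictedFeatures;
    }
}

// In-memory stand-in for the feature store; a real service would query the database.
final class FeatureRegistry {
    // Key: algorithm URI + '|' + title. Value: feature id.
    private final Map<String, String> idsByKey = new LinkedHashMap<>();
    private int nextId = 1;

    // Returns the id of the existing feature, or creates a new one if none exists.
    String findOrCreate(String algorithm, String title) {
        return idsByKey.computeIfAbsent(algorithm + "|" + title, k -> "feature/" + nextId++);
    }

    int size() {
        return idsByKey.size();
    }
}

final class ModelFactory {
    // Resolves each announced title to a feature id, in the order the titles were given.
    // The resulting list is what would be stored as predictedFeatures in the new model.
    static List<String> resolvePredictedFeatures(
            FeatureRegistry registry, String algorithm, TrainingResponse response) {
        List<String> ids = new ArrayList<>();
        for (String title : response.predictedFeatures) {
            ids.add(registry.findOrCreate(algorithm, title));
        }
        return ids;
    }

    public static void main(String[] args) {
        FeatureRegistry registry = new FeatureRegistry();
        TrainingResponse response =
                new TrainingResponse(Arrays.asList("wiener", "hosoya", "estrada", "randic"));
        System.out.println(resolvePredictedFeatures(registry, "/algorithm/cdk", response));
    }
}
```

Because `findOrCreate` is idempotent, training a second model with the same algorithm reuses the existing features rather than creating duplicates.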
Then, when a prediction is made, the algorithm service provides the same titles for those features in the JPDI PredictionResponse. Jaqpot retrieves the features from the model, checks their titles, and so knows which column corresponds to which feature.
Is that correct?
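For the prediction step, the title-based matching could work along these lines. This is a sketch under assumptions: `ColumnMapper` and its method are hypothetical, the model's predicted features are represented as a plain id-to-title map, and the prediction response is reduced to the ordered list of column titles it reports.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ColumnMapper {
    // featureTitlesById: the model's predicted features (feature id -> title).
    // columnTitles: titles in the order the prediction response reports its columns.
    // Returns a map from column index to the id of the matching feature.
    static Map<Integer, String> mapColumns(
            Map<String, String> featureTitlesById, List<String> columnTitles) {
        // Invert the model's features so they can be looked up by title.
        Map<String, String> idByTitle = new HashMap<>();
        for (Map.Entry<String, String> e : featureTitlesById.entrySet()) {
            idByTitle.put(e.getValue(), e.getKey());
        }

        Map<Integer, String> featureIdByColumn = new LinkedHashMap<>();
        for (int i = 0; i < columnTitles.size(); i++) {
            String id = idByTitle.get(columnTitles.get(i));
            if (id == null) {
                // The service returned a title that was not announced at training time.
                throw new IllegalArgumentException(
                        "No predicted feature titled '" + columnTitles.get(i) + "'");
            }
            featureIdByColumn.put(i, id);
        }
        return featureIdByColumn;
    }

    public static void main(String[] args) {
        Map<String, String> modelFeatures = new LinkedHashMap<>();
        modelFeatures.put("feature/1", "wiener");
        modelFeatures.put("feature/2", "hosoya");

        // The service may report columns in a different order than at training time.
        System.out.println(mapColumns(modelFeatures, Arrays.asList("hosoya", "wiener")));
    }
}
```

This only works if titles are unique within a model, which seems to be implied by the proposal but is worth stating explicitly.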