jvalegre / robert

Automated machine learning protocols that start from CSV databases of descriptors or SMILES and produce publication-quality results in Chemistry studies with only one command line.
MIT License

Predictors #49

Closed Exlonk closed 1 day ago

Exlonk commented 1 day ago

I only want the descriptors, so I use:

python -m aqme --csearch --program rdkit --input CSV_NAME.csv --sample 50

python -m aqme --qdescp --files "CSEARCH/*.sdf" --program xtb --csv_name CSV_NAME.csv

In the second step, I had to rename the column 'smiles' to 'SMILES'.
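For reference, the rename described above can be scripted instead of done by hand. This is only a minimal sketch using Python's standard `csv` module. The helper name `rename_csv_column` and the example column `code_name` are illustrative assumptions and not part of AQME or ROBERT.

```python
import csv
import io


def rename_csv_column(csv_text: str, old: str = "smiles", new: str = "SMILES") -> str:
    """Return the CSV text with the header column `old` renamed to `new`.

    Hypothetical helper for illustration; it only touches the header row.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text  # empty input: nothing to rename
    # Replace the matching header name and leave all other headers untouched.
    rows[0] = [new if name == old else name for name in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Illustrative data only (not taken from the issue).
example = "code_name,smiles\nmol_1,CCO\n"
renamed = rename_csv_column(example)
```

To apply it to a file, read the file's text, pass it through the helper, and write the result back before running the second command.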

bug

Question: When I tried to get only the descriptors, I had to include a "target" column. Are the predictors constructed as a function of the target?

Question: I tried to use only the machine learning model with my own CSV, which I got from the previous step. Some columns contained lists and it didn't work. If I provide only a CSV file, can I use only columns with numerical values?

ddgunizar commented 1 day ago

So right now you have two ways of obtaining the descriptors:

In response to the second question:

jvalegre commented 1 day ago

Just to add to the previous comment: AQME v1.6.1 (probably the version you're using) leaves columns with lists corresponding to atomic descriptors after running QDESCP. In the new version 1.7.0, these lists aren't created, so it's easier to create descriptor databases. v1.7.0 is currently available on GitHub and should soon be available on pip/conda as well.
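For anyone still on a version that writes list-valued columns, one way to see which columns are affected is to check which ones parse as plain numbers. The sketch below uses only Python's standard library. The function names and the example column names (`code_name`, `HOMO`, `partial_charges`) are illustrative assumptions, and it does not reflect how ROBERT itself filters its input.

```python
import csv
import io


def is_number(value) -> bool:
    """True if `value` can be parsed as a float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def split_numeric_columns(csv_text: str):
    """Return (numeric_columns, other_columns) for the given CSV text.

    A column counts as numeric only if every row's value parses as a float,
    so columns holding lists such as "[0.1, -0.2]" end up in other_columns.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []
    numeric = [c for c in columns if rows and all(is_number(r[c]) for r in rows)]
    other = [c for c in columns if c not in numeric]
    return numeric, other


# Illustrative data only (not taken from the issue).
example = (
    "code_name,target,HOMO,partial_charges\n"
    'mol_1,1.2,-0.31,"[0.1, -0.2]"\n'
    'mol_2,0.8,-0.29,"[0.0, 0.3]"\n'
)
numeric, other = split_numeric_columns(example)
```

Columns reported in `other` are the ones to inspect or drop by hand. Name or identifier columns will also land there, so the result is a starting point for review rather than an automatic filter.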

BTW, please follow the standard issue templates when possible. That way we can help users more efficiently, since we then have versions, installation info, etc.

Exlonk commented 1 day ago

You are very kind. Thanks a lot for your answers.