samplchallenges / SAMPL6

Challenge inputs, details, and results for the SAMPL6 series of challenges
https://samplchallenges.github.io
MIT License

logP analysis with reassigned method categories #95

Closed: MehtapIsik closed this 5 years ago

MehtapIsik commented 5 years ago

New logP analysis directory for reassigned method categories: https://github.com/samplchallenges/SAMPL6/tree/logP_analysis5/physical_properties/logP/analysis_with_reassigned_categories

  1. Physical category divided into Physical (MM) and Physical (QM).

  2. Methods from the Other category were reassigned to Empirical and Physical (QM):

     - 7dhtp: OTHER --> EMPIRICAL
     - arw58: OTHER --> QM
     - 4p2ph: OTHER --> QM
     - rs4ns: OTHER --> QM
     - 5t0yn: OTHER --> QM
     - c7t5j: OTHER --> QM
     - jc68f: OTHER --> QM
     - hsotx: OTHER --> QM
     - fe8ws: OTHER --> QM

  3. The method map file has a table with a "category" column showing the participant-assigned method categories and a "reassigned_category" column showing the new method categories we will use in the paper.

  4. I added a null prediction with submission ID NULL0. It predicts logP as 2.66, the mean clogP of oral FDA-approved NCEs from the 1998-2017 period, as reported in this paper: https://doi.org/10.1021/acs.jmedchem.8b00686 I assigned it to the Empirical category, and its method name is "mean clogP of FDA approved oral drugs (1998-2017)".

  5. I changed the barplot colors to the colorblind-friendly Zesty color palette. Hex color codes: "#0F2080", "#F5793A", "#A95AA1", "#85C0F9"

All categories: /analysis_with_reassigned_categories/analysis_outputs_withrefs/StatisticsTables/RMSE_vs_method_plot_colored_by_method_category.pdf

QM vs MM: /analysis_with_reassigned_categories/analysis_outputs_withrefs/StatisticsTables/RMSE_vs_method_plot_physical_methoods_colored_by_method_category.pdf

  6. I adjusted figure sizes and fonts as much as possible for the plots that will become paper figures. The remaining adjustments, such as keeping axis labels inside the figure frame and trimming white space, can be done in Inkscape.

  7. I added a ridge plot of the prediction error distribution for each molecule.

     For all methods including REFs: https://github.com/samplchallenges/SAMPL6/blob/logP_analysis5/physical_properties/logP/analysis_with_reassigned_categories/analysis_outputs_withrefs/MolecularStatisticsTables/molecular_error_distribution_ridge_plot_all_methods.pdf

     For the 7 consistently well-performing methods: https://github.com/samplchallenges/SAMPL6/blob/logP_analysis5/physical_properties/logP/analysis_with_reassigned_categories/analysis_outputs_withrefs/MolecularStatisticsTables/molecular_error_distribution_ridge_plot_well_performing_methods.pdf
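To illustrate how the NULL0 null prediction behaves as a baseline, here is a minimal sketch, not the repository's analysis code. It assumes the null model assigns the same value, 2.66, to every molecule and scores it with RMSE. The function names and the two experimental values are hypothetical placeholders, not SAMPL6 measurements.

```python
import math

# Constant logP quoted in the PR notes for submission NULL0.
NULL_LOGP = 2.66


def null_predictions(molecule_ids, value=NULL_LOGP):
    """Return the same predicted logP for every molecule ID (hypothetical helper)."""
    return {mol_id: value for mol_id in molecule_ids}


def rmse(predicted, experimental):
    """Root-mean-square error over the molecules present in both dicts."""
    shared = sorted(set(predicted) & set(experimental))
    if not shared:
        raise ValueError("no molecules in common")
    squared_errors = [(predicted[m] - experimental[m]) ** 2 for m in shared]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical placeholder values, NOT SAMPL6 experimental data.
experimental = {"mol_a": 1.66, "mol_b": 3.66}
preds = null_predictions(experimental)
print(round(rmse(preds, experimental), 2))
```

Because the null model ignores molecular structure, its RMSE gives a floor that any submitted method would be expected to beat.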

bergazin commented 5 years ago

If ridge plots are used now, then there should be one for the extra molecules too.

MehtapIsik commented 5 years ago

Thanks, Danielle. I added the ridge plot for extra molecules: https://github.com/samplchallenges/SAMPL6/blob/logP_analysis5/physical_properties/logP/analysis_of_extra_molecules/analysis_outputs_withrefs/MolecularStatisticsTables/molecular_error_distribution_ridge_plot_all_methods.pdf

I updated the colors of the method category comparison by molecule: https://github.com/samplchallenges/SAMPL6/blob/logP_analysis5/physical_properties/logP/analysis_of_extra_molecules/analysis_outputs_withrefs/MolecularStatisticsTables/molecular_MAE_comparison_between_method_categories.pdf

davidlmobley commented 5 years ago

@MehtapIsik the link in the PR notes to the "method map file" is broken; which file is this?

MehtapIsik commented 5 years ago

It is here: https://github.com/samplchallenges/SAMPL6/blob/master/physical_properties/logP/predictions/SAMPL6-logP-method-map.csv
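For anyone checking the reassignments against that file, a rough sketch of reading the two category columns follows. It uses an inline stand-in rather than the real CSV: the submission IDs and reassignments come from the list in this PR, while the `submission_id` column name and the exact label spellings ("Other", "Empirical", "Physical (QM)") are assumptions that may differ from the actual file.

```python
import csv
import io
from collections import Counter

# Inline stand-in for the method map file. IDs and reassignments are taken
# from the PR notes; the "submission_id" header and label spellings are assumed.
METHOD_MAP_CSV = """\
submission_id,category,reassigned_category
7dhtp,Other,Empirical
arw58,Other,Physical (QM)
4p2ph,Other,Physical (QM)
rs4ns,Other,Physical (QM)
5t0yn,Other,Physical (QM)
c7t5j,Other,Physical (QM)
jc68f,Other,Physical (QM)
hsotx,Other,Physical (QM)
fe8ws,Other,Physical (QM)
"""


def load_method_map(handle):
    """Map each submission ID to (category, reassigned_category)."""
    reader = csv.DictReader(handle)
    return {
        row["submission_id"]: (row["category"], row["reassigned_category"])
        for row in reader
    }


method_map = load_method_map(io.StringIO(METHOD_MAP_CSV))

# Submissions whose category changed, and a tally of the new categories.
reassigned = {sid: cats for sid, cats in method_map.items() if cats[0] != cats[1]}
counts = Counter(new for _, new in method_map.values())
print(counts)
```

To run this against the real file, pass an open file handle to `load_method_map` instead of the `io.StringIO` wrapper and adjust the column names to match.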