Dlux804 / McQuade-Chem-ML

Development of easy to use and reproducible ML scripts for chemistry.

Added new sklearn models and hyper-parameter tuning for classification workflow. #85

dickeygh closed 4 years ago

dickeygh commented 4 years ago

Get_Task_Type.py:

  1. Added definition of tune variable to be used in main.py.
  2. Adjusted to work with new classification models.

Classifiers.py:

  1. Reworked parts of this file to allow for hyper-parameter tuning.
  2. Added extra models.

features.py:

  1. Added an if statement to allow classification workflow to run.

grid.py:

  1. Added new tuning for linearSVC

name.py:

  1. Added linearSVC to algorithm_list

regressors.py:

  1. Added if statements at lines 161-164 and lines 212-216 to allow for hyper-parameter tuning in the classification workflow.

Issues 82 and 84 are related to this PR. Please check them out and let me know of any potential solutions. However, these issues do not cause problems with this PR, as Get_Task_Type.py allows the workflow to run without these errors being raised.
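
For reference, the kind of search space that the grid.py change describes can be sketched as follows. This is a minimal illustration only: the dictionary name, the `expand_grid` helper, and the parameter values are assumptions and are not taken from grid.py. The keys `C`, `loss`, and `max_iter` are real constructor arguments of sklearn's `LinearSVC`.

```python
from itertools import product

# Hypothetical search space for sklearn's LinearSVC. The values are
# illustrative and are not the ones used in grid.py.
linear_svc_grid = {
    "C": [0.1, 1.0, 10.0],
    "loss": ["hinge", "squared_hinge"],
    "max_iter": [1000, 5000],
}


def expand_grid(grid):
    """Return every parameter combination in the grid as a list of dicts."""
    names = sorted(grid)  # fixed key order keeps the output reproducible
    return [
        dict(zip(names, values))
        for values in product(*(grid[name] for name in names))
    ]


candidates = expand_grid(linear_svc_grid)
print(len(candidates))  # 3 * 2 * 2 = 12 combinations
print(candidates[0])
```

Each dictionary in `candidates` has the shape that a tuner would pass to the classifier constructor as keyword arguments. The workflow itself may use a different search strategy than an exhaustive grid.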

pep8speaks commented 4 years ago

Hello @dickeygh! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 11:80: E501 line too long (150 > 79 characters)
Line 15:80: E501 line too long (88 > 79 characters)
Line 22:10: E261 at least two spaces before inline comment

Line 350:80: E501 line too long (85 > 79 characters)
Line 358:9: E303 too many blank lines (2)
Line 358:80: E501 line too long (109 > 79 characters)

Line 3:80: E501 line too long (99 > 79 characters)
Line 5:1: E302 expected 2 blank lines, found 1

Line 46:80: E501 line too long (117 > 79 characters)

Line 188:80: E501 line too long (118 > 79 characters)
Line 189:80: E501 line too long (104 > 79 characters)
Line 191:80: E501 line too long (118 > 79 characters)

Line 162:80: E501 line too long (87 > 79 characters)
Line 222:80: E501 line too long (87 > 79 characters)

Line 239:5: E303 too many blank lines (2)
Line 248:5: E303 too many blank lines (2)
Line 255:80: E501 line too long (95 > 79 characters)
Line 256:80: E501 line too long (94 > 79 characters)
Line 257:80: E501 line too long (99 > 79 characters)
Line 266:80: E501 line too long (93 > 79 characters)
Line 267:80: E501 line too long (92 > 79 characters)
Line 268:80: E501 line too long (97 > 79 characters)

Line 22:4: E114 indentation is not a multiple of four (comment)
Line 27:4: E114 indentation is not a multiple of four (comment)
Line 114:80: E501 line too long (107 > 79 characters)

Comment last updated at 2020-09-28 18:32:28 UTC

dickeygh commented 4 years ago

Today, I pushed several changes to this PR:

Classifiers.py:

  1. Removed linearSVC

Get_Task_Type.py:

  1. Removed linearSVC
  2. Removed section for deciding if tuning should be performed.

grid.py:

  1. Removed linearSVC

models.py:

  1. Added a new if statement that sets tune to false for the classification models that can't be tuned properly.

name.py:

  1. Removed linearSVC

main.py:

  1. Removed linearSVC
  2. Changed how tune is decided so that it can be set to false or true.

Features.py:

  1. Moved lines 73-88 to lines 36-50. This is so that canonicalization will work properly for the classification workflow.
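
The tune handling described for models.py and main.py above could look roughly like the sketch below. Everything in it is an assumption made for illustration: the function name `resolve_tune`, the `task_type` argument, and the placeholder algorithm names are hypothetical, and the actual list of models that cannot be tuned is not stated in this PR.

```python
# Placeholder names for classifiers whose tuning is switched off.
# These are hypothetical; the real names live in models.py.
UNTUNABLE_CLASSIFIERS = {"untunable_clf_a", "untunable_clf_b"}


def resolve_tune(algorithm, task_type, tune):
    """Return the effective tune flag for one model run.

    The user's choice is kept unless the run is a classification run
    with an algorithm listed as untunable, in which case tuning is off.
    """
    if task_type == "classification" and algorithm in UNTUNABLE_CLASSIFIERS:
        return False
    return tune


print(resolve_tune("untunable_clf_a", "classification", True))
print(resolve_tune("some_other_clf", "classification", True))
print(resolve_tune("untunable_clf_a", "regression", True))
```

Keeping the override in one place means main.py can pass true or false freely while the model code decides whether the request can be honored.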

dickeygh commented 4 years ago

Today, I made the following changes to this PR:

  1. I merged it with the newly updated development branch.

Analysis.py:

  1. Used self.predictions_analysis instead so that the classification workflow would save/analyze correctly for both multi-label and single-label.

Train.py:

  1. Added a split between multi-label and single-label, and declared self.predictions_analysis for use in single-label analysis.
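
A rough sketch of what a multi-label versus single-label split can look like is shown below. It is not the code in Train.py: the helper names, the use of plain lists, and the shape of the returned structure are all assumptions, chosen only to show the branching idea.

```python
def is_multi_label(targets):
    """Return True when each sample carries a sequence of labels."""
    return len(targets) > 0 and isinstance(targets[0], (list, tuple))


def build_predictions_analysis(actual, predicted):
    """Pair actual and predicted values for later analysis.

    Single-label input gives a list of (actual, predicted) pairs.
    Multi-label input gives a dict mapping each label index to such a list.
    """
    if not is_multi_label(actual):
        return list(zip(actual, predicted))
    n_labels = len(actual[0])
    return {
        i: [(a[i], p[i]) for a, p in zip(actual, predicted)]
        for i in range(n_labels)
    }


single = build_predictions_analysis([0, 1, 1], [0, 1, 0])
multi = build_predictions_analysis([[1, 0], [0, 1]], [[1, 1], [0, 1]])
print(single)
print(multi)
```

Downstream analysis code can then check which of the two shapes it received and handle each label column separately in the multi-label case.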