The list of categories is unchanged since the repository's inception (2022-06). The current distribution of projects shows that it no longer adequately categorizes the atomistic ML (AML) projects it features. Two examples: 1) The two largest categories, ML-IAP and Rep-Learn, have 65 and 55 projects, respectively, and overlap thematically. 2) Some of the smallest categories, like Active learning and XAI, have fewer than five projects and have shown no significant growth for at least one year.
[ ] Dissolve all categories with a) fewer than ~5 projects AND b) no significant growth for >1 year, and re-sort each one's projects either 1) into the next-best fitting, still existing category, or 2) if no existing category fits, into the new category "Miscellaneous". Add their former category as label wherever it is missing.
Candidate categories: TODO
Selected categories: TODO
Reasoning: TODO
Done in COMMIT: TODO
[ ] Merge ML-DFT, ML-ESM, ML-WFT into new category "Machine learning of first-principles observables" (ML-FPO). Add their former category as label wherever it is missing.
Done in COMMIT: TODO
[ ] Come up with a solution for the largest category set ML-IAP, MD, Rep-Learn. How could this be broken up into more digestible categories, so that the project distribution is more even?
Category details:
Currently, the distribution of projects among categories is very uneven.
Additional context:
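A minimal sketch of how the size criterion and the "keep the former category as label" step from the tasks above could be scripted. It assumes each project is available as a dict with a `category` key and an optional `labels` list; that layout, the function names, and the category ids used in the usage note are illustrative assumptions, not taken from the repository. The growth criterion (no significant growth for >1 year) is not covered here, since it needs historical data that this sketch does not have.

```python
from collections import Counter

# Threshold taken from the task description ("~5 projects"); adjust as needed.
MIN_PROJECTS = 5


def small_categories(projects, min_projects=MIN_PROJECTS):
    """Return {category: project count} for categories below the size threshold.

    `projects` is assumed to be a list of dicts with a "category" key.
    """
    counts = Counter(p["category"] for p in projects)
    return {cat: n for cat, n in counts.items() if n < min_projects}


def recategorize(project, new_category):
    """Return a copy of `project` moved to `new_category`.

    The former category is appended to the labels if it is missing.
    The input dict is left unchanged.
    """
    old_category = project["category"]
    labels = list(project.get("labels", []))
    if old_category not in labels:
        labels.append(old_category)
    return {**project, "category": new_category, "labels": labels}
```

Usage could look like `small_categories(projects)` to fill in the "Candidate categories" item, followed by `recategorize(project, "misc")` for each affected project, where `"misc"` is a placeholder id for the new "Miscellaneous" category.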