SJ001 / AI-Feynman


Question on AI-Feynman training speed #20

Open koh-joshua opened 4 years ago

koh-joshua commented 4 years ago

I came across your wonderful work in the Medium article. I consider myself a rookie when it comes to machine learning or AI, and as such have some beginner's questions on the AI-Feynman system:

  1. Is there a bottleneck on the training speed? For example, is the code optimized to run on CUDA or in parallel across GPUs/CPUs? I noticed that PyTorch is used and am wondering whether a GPU makes a difference. If not, can the training be distributed in parallel across CPUs? (A small probe I would run to check this is sketched after this list.)

  2. When applying this to machine learning datasets, what would be a realistic/practical limit on the data size or input dimension? For example, can this be applied to find a regression equation for a target y using a dataset with 1000 features (i.e. X1, X2, ..., X1000)? Or should some sort of feature selection or dimensionality reduction be applied first, to reduce the number of features to a manageable amount? (See the sketch after this list for what I have in mind.)

  3. What would be the best practice for applying this to a machine learning dataset, for example the Boston Housing Prices dataset commonly used in regression tasks? (A rough sketch of how I imagine preparing such a dataset follows the list.)
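
For context on question 1, here is a minimal, stand-alone probe I would run to see whether PyTorch can use a GPU at all. It is only a sketch of the kind of check I mean, not anything taken from the AI-Feynman code.

```python
# Quick probe: does this machine's PyTorch build see a CUDA device?
# (Generic PyTorch check, not from the AI-Feynman repository.)
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))

# If a GPU is present, tensors and models can be moved to it in the usual way.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.randn(8, 3, device=device)  # toy tensor, just to confirm the transfer
print("Tensor lives on:", x.device)
```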
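For question 2, this is the kind of dimensionality reduction I have in mind before attempting symbolic regression. The array shapes, the scikit-learn PCA, and the cut-off of 20 components are all placeholders of my own, not anything the repository prescribes.

```python
# Sketch: shrink a 1000-feature table to a handful of components before
# symbolic regression. All data here is random placeholder data.
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(5000, 1000)   # stand-in for a wide feature matrix
y = np.random.rand(5000)         # stand-in target

pca = PCA(n_components=20)       # arbitrary cut-off, purely illustrative
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)           # (5000, 20)

# The reduced features plus the target, stacked column-wise, would then be
# what I hand to the symbolic-regression step instead of the raw 1000 columns.
data = np.column_stack([X_reduced, y])
```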
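And for question 3, this is roughly how I imagine preparing a tabular dataset, based on my (possibly wrong) reading that the solver expects a whitespace-separated text file with the target in the last column. The file names, the column name, and the commented-out `run_aifeynman` arguments are guesses on my part.

```python
# Sketch: turn a housing-prices table into the whitespace-separated text file
# I believe the solver reads (feature columns first, target as the last column).
# File and column names here are hypothetical.
import numpy as np
import pandas as pd

df = pd.read_csv("boston_housing.csv")        # hypothetical local copy
X = df.drop(columns=["MEDV"]).to_numpy()      # feature columns
y = df["MEDV"].to_numpy()                     # target: median house value

np.savetxt("example_data/boston.txt", np.column_stack([X, y]))

# My guess at the call, going by the README examples (argument values guessed):
# from aifeynman import run_aifeynman
# run_aifeynman("example_data/", "boston.txt", 60, "14ops.txt",
#               polyfit_deg=3, NN_epochs=400)
```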

Thank you for your time. I have not run the algorithm myself because I'm on Windows 10, and it looks like that may be a problem.

blackjack991 commented 4 years ago

Hello! May I ask if you managed to do some testing on the code?