ReaLLMASIC / nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.
MIT License

Modified train.py to enable plotting of input/output statistics for constantmax #163

Closed: Hrancheng closed this 3 months ago

Hrancheng commented 3 months ago

Added a --statistic argument that lets users choose which input/output statistic to display. It currently works only for constantmax; I will look into whether the same code can be extended to other softmax variants.

Usage:

logging_group.add_argument('--statistic',
    choices=['input_mean', 'input_median', 'input_stdev', 'input_max',
             'output_mean', 'output_median', 'output_stdev', 'output_max'],
    default='input_mean',
    help='Select the statistic and type to display, example: input_mean, output_max')
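For anyone trying the flag out, here is a minimal, self-contained sketch of how the --statistic value could be parsed and split into its type (input/output) and statistic name. The group title 'logging' and the helper split_statistic are assumptions made for illustration; they are not taken from train.py.

```python
import argparse

# Choices copied from the PR description.
STATISTIC_CHOICES = [
    'input_mean', 'input_median', 'input_stdev', 'input_max',
    'output_mean', 'output_median', 'output_stdev', 'output_max',
]


def build_parser():
    parser = argparse.ArgumentParser()
    # Assumption: the group title 'logging' is hypothetical.
    logging_group = parser.add_argument_group('logging')
    logging_group.add_argument(
        '--statistic',
        choices=STATISTIC_CHOICES,
        default='input_mean',
        help='Select the statistic and type to display, '
             'example: input_mean, output_max',
    )
    return parser


def split_statistic(value):
    """Hypothetical helper: 'output_max' -> ('output', 'max')."""
    kind, stat = value.split('_', 1)
    return kind, stat


# Parse an explicit argument list so the sketch does not read sys.argv.
args = build_parser().parse_args(['--statistic', 'output_max'])
kind, stat = split_statistic(args.statistic)
print(kind, stat)
```

Because choices is set, argparse rejects any value outside the list before training starts, so a typo in the flag fails early with a usage message.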

Also enabled the denominator to be written to the CSV:

self.write_to_csv(self.iter_num, i_sum_vals, i_means, i_medians, i_stdevs, i_max_values, denominator, prefix="inputs")
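To illustrate what adding the denominator as an extra column looks like, here is a rough standalone sketch of a CSV row writer with a similar call shape. It is not the repository's write_to_csv: the function name write_stats_row, the "<prefix>_stats.csv" file name, and the flattening of list arguments are all assumptions.

```python
import csv
import os
import tempfile


def write_stats_row(out_dir, iter_num, *columns, prefix=""):
    """Append one row to <out_dir>/<prefix>_stats.csv (hypothetical layout).

    List or tuple arguments are flattened into consecutive cells;
    scalar arguments (such as a denominator) take a single cell.
    """
    path = os.path.join(out_dir, f"{prefix}_stats.csv")
    row = [iter_num]
    for col in columns:
        if isinstance(col, (list, tuple)):
            row.extend(col)
        else:
            row.append(col)
    # Append mode keeps rows from earlier iterations.
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow(row)
    return path


def read_rows(path):
    """Read the CSV back as a list of string rows."""
    with open(path, newline="") as f:
        return list(csv.reader(f))


# Demo with made-up numbers, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    means = [0.5, 0.25]   # illustrative per-layer means
    denominator = 100.0   # illustrative denominator value
    csv_path = write_stats_row(tmp, 10, means, denominator, prefix="inputs")
    demo_rows = read_rows(csv_path)
    print(demo_rows)
```

Passing the denominator as one more positional argument means existing columns keep their positions and the new value lands at the end of each row.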

gkielian commented 3 months ago

@Hrancheng It seems the CI tests require a few more dependencies now that visualization has been added to train.py.

I opened a PR that updates requirements_cpu.txt to install seaborn, plotly, and pandas, which should hopefully fix the CI tests.

https://github.com/Hrancheng/nanoGPT/pull/2