alanctprado / pace2024

Create .sh script to parallelize and evaluate test cases #11

Closed: alanctprado closed this issue 1 month ago

alanctprado commented 2 months ago

The script should run all the test cases in a directory and create a .csv file with the results.

The script should limit each test case to a 30-minute runtime and 8 GB of memory usage.

The .csv file columns should be the test case, the result (TLE, MLE or Solved), the time it took to run and the memory usage.

The script should receive as parameters the number of processes to create, the folder containing the test cases, and the path to the output file.

We need this in order to evaluate improvements to the solver over time.
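
As a starting point for discussion, here is a rough sketch of what such a script could look like. Everything below is illustrative: the `./solver` binary name, the reliance on GNU `time` and `timeout`, and the crude MLE detection (treating any non-timeout failure as a memory failure) are assumptions, not the actual implementation.

```bash
#!/usr/bin/env bash
# Illustrative sketch only. Usage: ./evaluate.sh <num_processes> <test_dir> <output_csv>
set -u

NUM_PROCS="$1"
TEST_DIR="$2"
OUT_CSV="$3"

TIME_LIMIT=1800                      # 30 minutes, in seconds
MEM_LIMIT_KB=$((8 * 1024 * 1024))    # 8 GB, in KB (for ulimit -v)

echo "test_case,result,time_s,memory_kb" > "$OUT_CSV"

run_case() {
    local case_file="$1"
    local tmp status line elapsed mem result
    tmp=$(mktemp)
    # Cap virtual memory in a subshell, then run the (hypothetical) solver
    # under GNU time to capture elapsed seconds (%e) and max RSS in KB (%M).
    (
        ulimit -v "$MEM_LIMIT_KB"
        /usr/bin/time -f "%e %M" -o "$tmp" \
            timeout "$TIME_LIMIT" ./solver < "$case_file" > /dev/null 2>&1
    )
    status=$?
    line=$(tail -n 1 "$tmp" 2>/dev/null)
    elapsed=${line% *}
    mem=${line##* }
    rm -f "$tmp"
    if [ "$status" -eq 124 ]; then
        result="TLE"                 # timeout(1) exits with 124 on a timeout
    elif [ "$status" -ne 0 ]; then
        result="MLE"                 # crude: any other failure treated as memory
    else
        result="Solved"
    fi
    echo "$(basename "$case_file"),$result,$elapsed,$mem" >> "$OUT_CSV"
}

for case_file in "$TEST_DIR"/*; do
    # Keep at most NUM_PROCS cases running at once.
    while [ "$(jobs -rp | wc -l)" -ge "$NUM_PROCS" ]; do
        sleep 1
    done
    run_case "$case_file" &
done
wait
```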

luishgh commented 2 months ago

I think I have an initial version of the evaluation script: [screenshot of the script's output]

It shows the status, the elapsed real time in seconds and the maximum total memory used. The memory part needs some improvement, but what do you think about the format?

@alanctprado
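
The screenshot itself is not preserved in this transcript. Purely as an illustration of the format being discussed (all names and values below are invented), the rows might look something like:

```
test_case      result   time_s   memory_kb
exact_001.gr   Solved    12.34      204800
exact_002.gr   TLE     1800.00     1048576
```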

luishgh commented 2 months ago

Also, I'm coming to the conclusion that a simple text file for configuring the evaluation could be the best option: setting everything on the command line can make things hard to reproduce (we might want to rerun the same configuration after merging improvements, for example), and it would also make the script code less cumbersome.
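
For illustration, such a configuration file might look roughly like this; all keys and values below are hypothetical, not an agreed-upon format:

```
# evaluation.conf (hypothetical example)
processes       = 4
test_dir        = tests/exact-public
output_csv      = results/baseline.csv
time_limit_s    = 1800
memory_limit_mb = 8192
solver_flags    =    # extra flags passed to the solver, if any
```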

alanctprado commented 2 months ago

> I think I have an initial version of the evaluation script: [screenshot of the script's output]
>
> It shows the status, the elapsed real time in seconds and the maximum total memory used. The memory part needs some improvement, but what do you think about the format?
>
> @alanctprado

Looks great to me!

alanctprado commented 2 months ago

> Also, I'm coming to the conclusion that a simple text file for configuring the evaluation could be the best option: setting everything on the command line can make things hard to reproduce (we might want to rerun the same configuration after merging improvements, for example), and it would also make the script code less cumbersome.

You mean with the flags used, etc.? Sounds great, actually. Can you work on this?

luishgh commented 2 months ago

@alanctprado Just finished changing the evaluation script to this new format. The only thing that still bugs me is how we should set the sleep duration when waiting to launch new jobs (I have implemented this limit parameter). I guess something like 1/10 of the time limit? Waiting for the whole TL seems sub-optimal.
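
For concreteness, a minimal sketch of the kind of launch loop being discussed, reusing the made-up names from the sketch above and polling every tenth of the time limit:

```bash
# Hypothetical launch loop: poll for a free slot every TIME_LIMIT / 10
# seconds instead of waiting out a full time limit between launches.
POLL_INTERVAL=$(( TIME_LIMIT / 10 ))
for case_file in "$TEST_DIR"/*; do
    while [ "$(jobs -rp | wc -l)" -ge "$NUM_PROCS" ]; do
        sleep "$POLL_INTERVAL"
    done
    run_case "$case_file" &
done
wait
```

One alternative worth considering: on bash 4.3 or newer, `wait -n` blocks until any background job finishes, which would sidestep choosing a sleep duration at all.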