zen0wu / topcoder-greed

greedy editor for topcoder arena
Apache License 2.0

More flexible input format for filetest #135

Open wookayin opened 10 years ago

wookayin commented 10 years ago

This might be a new idea; let's discuss it.

Currently (as of 2.0 RC), the filetest template stores the test examples in a {ProblemName}.sample file. Each parameter value is parsed line by line; I suggest making the input format much more flexible.

For example, suppose there are 4 int[]s for input, each of them having 50 elements. Then we might have

50 // this is the length of arr0
1
2
... (46 lines)
49
50
50 // this is the length of arr1
1
2
... (46 lines)
49
50
(... and repeat two more)

This results in just over 200 lines (4 × 51 = 204), which is not easy to maintain.
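To make the line count concrete, here is a minimal sketch of the per-line layout described above. It is not Greed's actual generated tester code; the helper names `read_int_array` and `per_line_sample` are hypothetical, and the `//` annotations from the example are assumed not to be part of the real file.

```python
def read_int_array(lines, pos):
    """Read one int[] in the per-line layout: a length line, then one element per line.

    Returns the parsed values and the index of the next unread line.
    """
    n = int(lines[pos])
    chunk = lines[pos + 1 : pos + 1 + n]
    if len(chunk) != n:
        raise ValueError("expected %d elements, found %d" % (n, len(chunk)))
    return [int(line) for line in chunk], pos + 1 + n


def per_line_sample(arrays):
    """Render arrays in the per-line layout (hypothetical helper, for counting lines)."""
    out = []
    for arr in arrays:
        out.append(str(len(arr)))
        out.extend(str(x) for x in arr)
    return out


# Four int[] parameters with 50 elements each, as in the example above.
sample = per_line_sample([list(range(1, 51))] * 4)
arr0, next_pos = read_int_array(sample, 0)
print(len(sample))  # 4 * (1 + 50) lines in total
```

Each array costs one line for its length plus one line per element, which is where the size of the sample file comes from.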

Instead, we could use a much more compact format, such as:

I'd prefer the second way (because such lines can be parsed when testing the code in the arena).

For string[] parameters, however, a grid style is sometimes preferable. {"NYY", "NYN", "NNN"} is harder to read than:

NYY
NYN
NNN

Currently, the input format requires that each element of an array be placed on its own line; relaxing this constraint (i.e. don't care whether ...) would also be nice. In conclusion, I think it would be great if the tester could accept all of these formats, with as much flexibility as possible.

Do you think this is a breaking change? Is it OK for the behavior or the generated samples to change before the new major release (a compatibility concern)? If not, I expect this could be a new feature for the next minor release (maybe 2.1?).

Please give me your feedback. Thanks!
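One possible reading of the relaxed constraint is that the reader ignores how integer tokens are split across lines. The sketch below rests on that assumption; it is not the format proposed in the thread, and `tokenize` and `read_int_array_tokens` are hypothetical names.

```python
def tokenize(text):
    """Split the whole sample on any whitespace, so line breaks carry no meaning."""
    return iter(text.split())


def read_int_array_tokens(tokens):
    """Consume a length token and then that many integer tokens."""
    n = int(next(tokens))
    return [int(next(tokens)) for _ in range(n)]


# Two int[] parameters; elements may share a line or be spread over several.
compact = "3\n1 2 3\n2 10\n20\n"
tokens = tokenize(compact)
arr0 = read_int_array_tokens(tokens)
arr1 = read_int_array_tokens(tokens)
print(arr0, arr1)
```

This only works for numeric parameters. A string may itself contain spaces, so string parameters would still need a line-based rule.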

zen0wu commented 10 years ago

The per-line input design is a simple but effective way to handle all the special cases, especially for string input. The problem with more readable data formats is that readability requires extra characters, and extra characters lead to confusion: if the input is a string, it can contain anything, including those extra characters.

So we have to consider what happens if a string input contains a comma or a double quote in the TopCoder style. Another drawback of this style is that it makes it harder for users to write their own test cases, especially under time pressure, as in a real-time contest.
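The ambiguity can be seen with a deliberately naive parser for the brace-and-quote style. This is an illustration only, not code from Greed; `naive_parse_string_array` is a hypothetical name.

```python
def naive_parse_string_array(text):
    """Parse {"a", "b"} by splitting on commas. Deliberately naive, to show the failure."""
    inner = text.strip()[1:-1]  # drop the surrounding braces
    return [item.strip().strip('"') for item in inner.split(",")]


ok = naive_parse_string_array('{"NYY", "NYN", "NNN"}')
broken = naive_parse_string_array('{"a,b", "c"}')  # first element contains a comma

# The per-line layout has no delimiter to collide with, apart from the newline itself.
per_line = "a,b\nc".splitlines()

print(ok)        # three elements, as intended
print(broken)    # the element "a,b" is split in two
print(per_line)  # two elements, as intended
```

A correct parser for the delimited style needs quoting and escaping rules, which is the extra complexity the per-line design avoids.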

Offering different style options is a great idea: we could provide several data styles and let the user choose. However, that would take much more effort on our side.

What I was thinking, before the release of 2.0 beta, is that we could extract the common test code into a tester lib by defining a universal data format. Imagine using a universal data format like XML: what the user gets is concise, clean code with a few extra calls to the tester lib, and the tester lib is shared between all problems. Right now we generate the test case parsing code according to the data types, so one improvement would be to move the parsing information into the data, by prepending the format definitions to the data file, or something like that.

So we could design a new test case format that strikes the best balance between ease of entry and readability, and that can be used universally across different languages and problems by also defining the data types.
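As a rough sketch of prepending format definitions to the data file, the example below assumes a header line that lists the parameter types, followed by values in the existing per-line layout. The header syntax, the `SCALARS` table, and `parse_case` are all hypothetical; the thread does not settle on a concrete format.

```python
# Hypothetical mapping from declared type names to converters.
SCALARS = {"int": int, "long": int, "double": float, "String": str}


def parse_case(lines):
    """Parse one test case whose first line declares the parameter types.

    Arrays are written as a length line followed by one element per line.
    """
    types = lines[0].split()
    pos = 1
    args = []
    for type_name in types:
        if type_name.endswith("[]"):
            convert = SCALARS[type_name[:-2]]
            n = int(lines[pos])
            pos += 1
            args.append([convert(lines[pos + i]) for i in range(n)])
            pos += n
        else:
            args.append(SCALARS[type_name](lines[pos]))
            pos += 1
    return args


case = ["int[] String int", "2", "5", "7", "NYY", "42"]
print(parse_case(case))
```

Because the types travel with the data, one shared parser like this could serve every problem instead of parsing code being generated per problem.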

@vexorian Your opinion please?

long-long-float commented 9 years ago

I would like an "ICPC-like style". If string input is a problem, I propose enabling it only for integer input.