Closed bencoman closed 2 years ago
Further to this, it seems that description
ends up trying to serve two purposes:
and ends up serving neither well. It constrains the length of method comments and encourages too-long method names.
Reading more about the schema... https://github.com/exercism/problem-specifications/blob/master/canonical-schema.json
and also that "Comments ... can be used almost anywhere" https://github.com/exercism/problem-specifications/blob/master/README.md#test-data-format-canonical-datajson
to fit current schema perhaps it would be reasonable for me to shorten the description
like this...
{
"description": "roll bonus balls before scoring"
"comments":
[ "Bonus balls for a strike in the last frame must be ",
, "rolled before score can be calculated" ],
"property": "score",
"input": {
"previousRolls": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10]
}
[EDIT:] In practice this seems to be turning out to be a bad idea. I now expect there will be a lot of push back trying to slim down the descriptions, and conversation will just bog down going in circles trying to reconcile dual purposes (description and identifier) from the one field. This PR for example... https://github.com/exercism/problem-specifications/pull/1525
Using your example as an example one might consider it easier to just adjust the ordering in the sentence, to simplify it and thus shorten it:
"description": "Score calculated after bonus rolled strike in last frame"
This new description retains the meaning of the original description. There is no guarantee that the comments will translate into a track's test suite. The comments are there for the maintainers benefit in building the test suite for their track.
Instead of reworking the schema for the canonical data I think the first step ought to be seeing if it is possible to refactor the "offensive" descriptions to be less offensive, to be more concise but without making the language too sophisticated. For many people English is not their first language, so one has to be careful to not make the language too efficient so as to be difficult for a non-native English speakers to understand easily.
Don't take what I have said as meaning I am against changing the schema, quite the opposite. I just don't think changing the schema is necessarily the solution we need to rush too.
Instead of reworking the schema for the canonical data I think the first step ought to be seeing if it is possible to refactor the "offensive" descriptions to be less offensive, to be more concise but without making the language too sophisticated.
This would also be my preference. The example you presented (bowling), is one of the test suites with the longest descriptions. So maybe if we can shorten those, the problem goes away?
We can certainly see if we can improve the descriptions to generate better. Also worth noting that some test cases have nested descriptions - e.g. and often these repeat what the parent description say vs. refining it. Triangle is an example of this and it should be corrected (along with checking all the other sub-cased examples).
(the geek in me did think it would be fun to take a munged description and hit some web service to summarise it - but that's where the angel on my other shoulder shouts....stoppppp)
I feel like modifying the descriptions to serve the purpose of being an identifier is like forcing a square peg into a round hole by cutting off the square corners.
For many people English is not their first language, so one has to be careful to not make the language too efficient so as to be difficult for a non-native English speakers to understand easily.
What is "too efficient" is highly judgmental with a high-bike-shedding-factor. I fear slimming down descriptions trying to make them more identifier-like will burn everyone's time better spent on other things.
What is "too efficient" is highly judgmental with a high-bike-shedding-factor. I fear slimming down descriptions trying to make them more identifier-like will burn everyone's time better spent on other things.
I think this is a fair assessment. Perhaps we can take this new field under consideration? cc @ErikSchierboom, #1496
Somehow I'm a bit hesitant to include this new field, even though I understand the reasoning behind it. My main argument against it is that the current setup actually works fine for virtually all exercises/tracks. Why then change it? As far as I understand, you want to include the description in the code, which I think is something that tracks normally don't do (it is not wrong of course, just unusual).
Let's gather some more opinions to see if I'm the odd man out here. @iHiD @kytrinyx @rpottsoh @petertseng what do you feel about the proposal to add a new "identifier"
property that contains the name of the tested method, instead of using the "description"
property?
FWIW I'm inclined to agree with @ErikSchierboom: the description field works well enough. Adding an identifier
field is not necessary.
Context: the Rust generator does pretty well with a very simple function to convert descriptions to identifiers:
The advantage of adding an identifier
field which would overwrite such description-based identifier munging isn't obvious to me. Are there languages for which humans must type out the full identifier of a test case to run? If not, who cares how long they are?
The disadvantage is that if we add the new field, it imposes a maintenance burden. All tracks' test generators must handle the field, or be in (harmless-for-now) noncompliance with the canonical data spec. Everyone writing a PR updating the canonical data must decide whether or not to include the optional field. All the reviewers have to decide whether or not it's worth dogshedding it in this case. In my opinion, the cost of this maintenance burden outweighs the benefit of sometimes having cleaner identifiers.
@bencoman 's main concern I think is that the cascading descriptions as is are huuuuuge. Their language (Pharo (?)) I think is a visual one, so these huuuuge names show up all over and are not actually very "workable".
I think he's trying to minimise the bikeshedding by having both: a name and a description.
I think he's trying to minimise the bikeshedding by having both: a name and a description.
And my argument is that instead of doing all this bikeshedding, why not just use shorter descriptions?
And my argument is that instead of doing all this bikeshedding, why not just use shorter descriptions?
Bingo. My https://github.com/exercism/problem-specifications/issues/1473#issuecomment-468693534 from the beginning of March.
For those of us unfamiliar with Pharo (me) perhaps a screen shot or something that illustrates how/why long descriptions work in a negative way in that language / development environment. If that has been provided somewhere else then please supply a link.
It is easier to say no to any changes than it is to say yes when it isn't easy to appreciate what the issue is being faced.
Here are a couple of generic Pharo workflow videos for context...
and here is my Exercism specific example...
Context: the Rust generator does pretty well with a very simple function to convert descriptions to identifiers
The first sample I checked... "description": "instructions to move east and north"
that rust function produces "instructions_to_move_east_and_north"
which differs from "follow_instructions_to_move_east_and_north" of the matching rust exercise...
So I'm curious how the word "follow" was introduced by the rust generator.
Also that function doesn't help generate sufficiently short strings to make a good identifier. For example...
pub fn format_exercise_description(description: &str) -> String {
description
.chars()
.filter(|c| c.is_alphanumeric() || *c == ' ')
.collect::<String>()
.replace(" ", "_")
.to_lowercase()
}
fn main() {
let description = "Where R = Turn Right, L = Turn Left and A = Advance, the robot can follow a series of instructions and end up with the correct position and direction";
println!("{}", format_exercise_description(description));
let description = "A dart whose coordinates sum to > 1 but whose radius to origin is <= 1 is scored in the inner circle";
println!("{}", format_exercise_description(description));
let description = "instructions to move east and north";
println!("{}", format_exercise_description(description));
}
produces...
where_r__turn_right_l__turn_left_and_a__advance_the_robot_can_follow_a_series_of_instructions_and_end_up_with_the_correc
t_position_and_direction
a_dart_whose_coordinates_sum_to__1_but_whose_radius_to_origin_is__1_is_scored_in_the_inner_circle
instructions_to_move_east_and_north
The advantage of adding an identifier field which would overwrite such description-based identifier munging isn't obvious to me. Are there languages for which humans must type out the full identifier of a test case to run? If not, who cares how long they are?
That is off point. The full identifier doesn't need to be typed in Pharo where individual tests are run with a single click. The issue is super long method names being awkward in the navigation pane.
The issue is super long method names being awkward in the navigation pane.
Hence my @rpottsoh' suggestion to just shorten the descriptions (which really are excessively long). Once again, why should we add a new field that would require maintenance work for many people, for a problem that appears to only really be problematic in one language track and for which there is a relatively easy fix (shorten the descriptions)?
p.s. in the time we spent discussing this new field we could have done enough bikeshedding to come up with suitably short descriptions I feel...
As far as I understand, you want to include the description in the code,
Including existing descriptions in generated code comments is a secondary concern.
My primary concern is reducing the length of identifiers used for naming test methods.
which I think is something that tracks normally don't do (it is not wrong of course, just unusual).
At least @exercism/scala seems to concatenate nested descriptions into comments. and @exercism/php nicely nests descriptions as comments and may be impacted by shortening the description field.
Hence my @rpottsoh' suggestion to just shorten the descriptions (which really are excessively long).
Yeah I'm also trying that path. Some support there would be nice.
p.s. in the time we spent discussing this new field we could have done enough bikeshedding to come up with suitably short descriptions I feel...
Instead the discussion at PR #1525 will be repeated a dozen times - for each exercise needing shorter descriptions.
I think this comes down to what is the true intent of having problem descriptions in a machine readable format? Why not have an ambiguous descriptive text description and be done with it?
I had understood it was to ease automatic generation of consistent descriptions across languages - at least the intent of tests could be consistent across languages regardless of implementation. You could read a test name and understand what you had to do.
This is bollocks - it’s badly conceived and should be sorted . The descriptions have grown organically without much thought to autogeneration and we are left to muddle through?
This isn’t a language problem or ide problem at all!
Can someone clearly in pseudo code tell me how to unambiguously generate clear and concise test names for all the exercises without having to adjust anything? Ideally so I can read the same test name in any language and compare the unique implementation in that language to any other?
If so, we will change Pharo to that - but I suspect it’s not the case, we’ve all worked around this nonsense and it’s a damn shame. It takes the steam out of wanting to improve this in my opinion , I’ve certainly backed away - I’ll solve a few more interesting problems with the descriptions we’ve got and maybe create a few custom exercises but it saps too much energy to correct things up stream in my opinion.
The interesting thing was that problems had a definition, but this looks rather accidental from my experience so far.
The pseudo code should be as simple as:
for every "case" (descending in pre-order) if a top level case and language ide sorts by name, prepend a "counter prefix" (e.g. testxx_) append the adjusted case description* end for
*adjusted description: if the description contains <,>, >= etc (convert to expanded descriptive text like "greater than") if the description contains invalid language method name characters strip them out (eg. &, + etc)
the following additional rules are a bit empirical (and shouldn't be needed if readability was considered when nesting cases)
if the description is all digits prepend "at " if the description begins with a digit prepend "and "
- The issue is typically nested cases where the author hasn't thought about a generation algorithm (possibly the above) and needlessly repeats information, so concatenation is repetitive and overly verbose.
Beyond a certain size, I don't think students actually read the test name, or struggle to find it useful (regardless of the language or IDE - so you have to manually customise it)
We've painfully corrected some specifications, there are still many that are sub optimal (look at robot-simulator, which starts out ok and then gets a bit silly)
In this comment, in the first section I share my view of our current purpose for descriptions (maintainer-centric, intended to help us evaluate coverage of test suites). The second is to examine the effect that some of these choices (nesting, descriptions) have on various test suites (it appears the original design was only made to support test suites that have nesting. It should be decided what sort of test suites problem-specifications is meant to encourage and make easy).
I am sorry to say that because of travel, I currently do not personally have a recommendation to provide, but I hope some information helps to understand what the elements of a desirable recommendation would be.
First, regarding the intent of descriptions:
I think this comes down to what is the true intent of having problem descriptions in a machine readable format? Why not have an ambiguous descriptive text description and be done with it?
I had understood it was to ease automatic generation of consistent descriptions across languages - at least the intent of tests could be consistent across languages regardless of implementation. You could read a test name and understand what you had to do.
I would say I agree with that understanding. I'll give my own take on the previously-quoted README, the quote being:
the description should not simply explain what each case is (that is redundant information) but also why each case is there. For example, what kinds of implementation mistakes might this case help us find?
... actually I'm almost obligated to give my own take on this, since any examination of the history would show that I wrote those sentences in https://github.com/exercism/problem-specifications/commit/add5995203cabe3c7d16313de5bd90c092b66843#diff-04c6e90faac2675aa89e2176d2eec7d8, following the discussion in https://github.com/exercism/problem-specifications/pull/451#issuecomment-263070019.
We have a desire, as collective maintainers of the various canonical-data.json, to retain information about why each test case exists, so that we as maintainers can evaluate what coverage we might lose if a test case is proposed to be removed. What mistaken solutions might now be allowed if we remove a test case?
For the sake of coverage of the test suites, I suggest that it is useful to keep that information somewhere, but if the allegation, discussion, and subsequent consensus is that the description is not the right place for it, then perhaps it is time to find a better place for it.
So, given this information we have, with each test case given a description and possibly nested in other groups of cases each with their own description, can we predict how a track's maintainers might choose to render these cases in the track's choice of test suite?
It is hard to say, but I have a suspicion the current structure was written in mind with test suites that support some form of nesting/grouping for their tests. As an example, I show the robot-simulator for Haskell (note that Haskell's tests were not generated as far as I can tell, but it is still a fine example to show the nesting): https://travis-ci.org/exercism/haskell/jobs/542405000#L3389-L3435. Only the leaves (items without further items grouped under them) are individual test cases.
For convenience it'll also be reproduced at the bottom of this comment.
I would say that since guidelines on nesting and descriptions implicitly only considered this style (I don't believe anything was discussed), other styles were left by the wayside.
So I would say something useful is to decide what sorts of test suites the canonical data is meant to encourage. What sort of information should be shown to the student trying to solve the exercise. Based on that, then it is possible to come up with guidelines on what sorts of nesting and descriptions should be done. Without knowing what we are trying to support and encourage, it is difficult to make recommendations if we don't understand what the goal is.
Nested descriptions, reproduced here for convenience (but unfortunately without the proper colour-coding shown in the Travis CI link)
mkRobot
A robot is created with a position and a direction
Negative positions are allowed
turning right rotates the robot's direction 90 degrees clockwise
from North
should change direction
shouldn't change position
from East
should change direction
shouldn't change position
from South
should change direction
shouldn't change position
from West
should change direction
shouldn't change position
turning left rotates the robot's direction 90 degrees counter-clockwise
from North
should change direction
shouldn't change position
from West
should change direction
shouldn't change position
from South
should change direction
shouldn't change position
from East
should change direction
shouldn't change position
advancing
North from (0,1)
shouldn't change direction
increases the y coordinate one when facing north
South from (0,-1)
shouldn't change direction
decreases the y coordinate by one when facing south
East from (1,0)
shouldn't change direction
increases the x coordinate by one when facing east
West from (-1,0)
shouldn't change direction
decreases the x coordinate by one when facing west
move
instructions to move east and north from README
instructions to move west and north
instructions to move west and south
instructions to move east and north
@petertseng thanks for a thoughtful reply (and apologies for my frustration). Your explanation clarifies many things, and I think it does confirm that we should offer some pseudo code for track authors about how they can approach generation. A reference implementation that goes as far as spitting out test descriptions and pseudo code for a test - would go a long way to helping everyone improve the current specifications as well as think about what to write for future specs.
The robot example you give, is much better that the current specification (and more in line with what I would expect to generate. The current one has (and there are lots of other spec like this too):
"Where R = Turn Right, L = Turn Left and A = Advance, the robot can follow a series of instructions and end up with the correct position and direction", "cases": [ { "description": "instructions to move east and north from README",
If my shortened descriptions
in #1525 are accepted, I'll ease off on this issue and plug away at other shortenings, although I still think its the wrong way to go.
The descriptions have grown organically without much thought to autogeneration and we are left to muddle through
I'm not sure how it aligns with the growth of the long descriptions, but I noticed that...
description: which will be used to name each generated test
was introduced four years ago and removed two years ago There is now only the following which encourages verbosity unsuited to identifier generation:
the description should not simply explain what each case is (that is redundant information) but also why each case is there. For example, what kinds of implementation mistakes might this case help us find?
So it seems the description
field has evolved over time and ended up with a mixed purpose pulling it in two directions - short versus long.
What sort of information should be shown to the student trying to solve the exercise
I myself got confused about the purpose of the "comment" field. In a greenfields scenario a better choice might be "maintainer_notes" and "student_notes" - but thats easy to see the distinction in hindsight.
Even though it duplicates/summarises the description.md , the following might fit well in "student_notes".
Where R = Turn Right, L = Turn Left and A = Advance, the robot can follow a series of instructions and end up with the correct position and direction"
I myself got confused about the purpose of the "comment" field. In a greenfields scenario a better choice might be "maintainer_notes" and "student_notes" - but thats easy to see the distinction in hindsight.
This is indeed a bit confusing.
So it seems the description field has evolved over time and ended up with a mixed purpose pulling it in two directions - short versus long.
Well spotted. I don't know why we changed this. Anybody has any idea?
Well spotted. I don't know why we changed this. Anybody has any idea?
description
shows some of the arguments).I myself got confused about the purpose of the "comment" field. In a greenfields scenario a better choice might be "maintainer_notes" and "student_notes" - but thats easy to see the distinction in hindsight.
Changing the field name would be a breaking change, but I think this could definitely be clarified in the README.
@bencoman wrote:
So it seems the description field has evolved over time and ended up with a mixed purpose pulling it in two directions - short versus long.
I didn't know this. This is a very reasonable concern.
To grow on your idea, perhaps rather than an "identifier": "roll_bonus_balls_before_scoring"
one could also have a "name": "Roll bonus balls before scoring"
to have tracks generate whatever snake-case or camel-case identifier they need.
If my shortened descriptions in #1525 are accepted, I'll ease off on this issue and plug away at other shortenings, although I still think its the wrong way to go.
While this issue does not affect any tracks I am involved with personally, in the interest of moving forward with #1492, I would like to know from everyone who is still involved in the discussion of an "identifier" field how they feel about taking another iteration of this proposed schema change:
"identifier"
or "name"
or "short_description"
or what?"description"
once did cf. @bencoman's digging?We may end up with this feature and we may not.
But I'd like for it to 1) be designed well, if at all, and 2) not block #1492.
Even if its useful to group addition of the other keys together, its also useful to concentrate on a single case for the first one.
I'm having some progress with the alternative of slimming down the descriptions, but although the feedback is worthwhile its slow to work through. It end up this is not required, but I'm not yet ready to let go of it. I think #1492 should proceed independent of this one, otherwise getting this one right would unnecessarily hold up #1492.
btw, By "identifier" I mean http://aboutc.weebly.com/identifiers.html not UUID. Perhaps "name" would better distinguish the intent, but not "short_description".
An absolute limit isn't required, but some guidance via a "recommended" length would be useful to get everyone on the same page. What conventions do people use for manually writing individual unit tests? btw, here is an analysis of test methods (~20,000) in Pharo's own system code...
Make this an optional key in canonical data. Generators can fall back to their existing generation from "description". Indeed some languages don't generate individual methods/functions for each test case, so would this key would not apply to them.
Are there still opponents to the idea of two separate description fields who think we can determine when a description fits in the middle?
Raises hand I still believe that we can shorten almost all descriptions enough to have reasonable length identifiers. Note that if I am the only one objecting, I'd of course get in line and follow the majority.
I do wonder what @kytrinyx's and @iHiD's opinions are on this matter.
I also believe that we can shorten almost all descriptions enough to have reasonable length identifiers.
But descriptions will diverge and become long again unless addressed.
Can the following statements highlighted by @bencoman both be true?
description: which will be used to name each generated test
the description should not simply explain what each case is (that is redundant information) but also why each case is there. For example, what kinds of implementation mistakes might this case help us find?
That is, a description that can be used to name each test that explains what each case is and also why it's there?
Is there an ideal soft max length of N characters, e.g. 60?
Most tracks use the description to generate the test name, if I understand correctly. Also, it seems like there's no current recommendation for an alternate method for generating names. As such, it is my recommendation that we continue to generate names based on the description, and that we make those descriptions short, allowing for ambiguity.
That said, shorter doesn't really mean more ambiguous most of the time, I think. @bencoman's example earlier, I find roll bonus balls before scoring to be much more understandable than Bonus balls for a strike in the last frame must be rolled before score can be calculated.
Where possible, I think it makes sense to base descriptions on the why if possible, since the how already exists in the implementation, but I don't think we should mandate that with any force, because there are just too many cases where the why is "duh". I suspect that any time we try to make the rules too rigid, we end up with silly edge cases. Since we have people writing these specs, not machines, it seems like a perfectly good time to allow for judgement calls.
I'm not opposed to adding or changing the spec in any way, but at the moment I don't understand how the long descriptions and a new short identifier would be better than short descriptions.
If we do find that we need maintainer notes and student notes, I think it's worth breaking the spec and spending some concerted effort helping those tracks whose generators break when new fields are added. (I would recommend, in that case, leaving "comment" as an empty field until generator maintainers can remove the reliance on it).
I'm going to try and give some thought from a product POV rather than a track maintainer POV (because I don't maintain any tracks and don't know anything about how this part of exercism works).
The most important thing in this is that we consider what is clear to the user's learning. This should be the main guiding principle.
The second most important thing is that maintainers can maintain the tracks easily without lots of extra work. Reducing unnecessary effort maintainers keeps everyone happy and means maintainers can do the most important things.
It's also important to raise that TDD is Exercism's preferred method of solving exercises, so the tests should be clear. I rarely even read the README when solving exercises and instead just attack the test-suite.
My instinct says that there are two clear things here - the name of the test, and some information that explains what's going on with the test. At the moment, these are combined into one field (description
), which has worked fine because we've kept a middle ground.
Wearing my product hat, I would fine with splitting these out explicitly into two fields, with the following conditions: 1) These fields should be explicitly different with their separate purposes clear, I would want them to diverge significantly. e.g.
information
)2) I don't want to arbitrarily cause work for the test-suite maintainers, especially as these people tend to also be doing things like auto-analysis atm so are already thinly spread. I think if this is agreed it should be done as follows:
name
is added, which is populated with the current description.information
is added, which is null
on all tests.names
and add information
fields.name
instead of description
. During that period, new exercises must have description
and name
fields added.description
.@iHiD's proposal would be an excellent outcome for me. I'd tend towards a more explicit studentNotes
over information
but the purpose of the latter might be reasonable specified in the repo README.
I'm am willing to keep on with shortening then existing description
if that ends up the only path forward, but I think @sshine raise a significant point, that...
descriptions will diverge and become long again unless addressed.
...because that is the nature of a "description".
I'm am willing to keep on with shortening then existing description if that ends up the only path forward, but I think @sshine raise a significant point, that...
Well, @sshine also suggested a possible fix, by limiting the number of characters that can be used for the description.
It looks like opinions vary a bit on this subject. Reading this thread, it looks like there are basically two options:
So how to continue? In another thread, it was implied that if no consensus was reached, the Exercism team would make a decision. In this case though, it looks like @kytrinyx and @iHiD have slightly different opinions :) Can we reach a consensus here or can otherwise a decision be made for us? I feel passionate about this issue, but I'm even more passionate about being able to continue and start working or closing this issue :)
In three parts: My personal recommendations, explanation of why I haven't been helping, and some statistics about the current description lengths that may help a decision.
Personal recommendations:
I observe that there are two questions whose answers may be decided on completely separately.
They are separate because the adding of new fields clearly defines a translation path between the old schema and the new schema, so if a description is to be shortened, it can be shortened in the old schema or the new schema and the result will be the same. (Mathematically, the two functions commute with one another)
If there should be a length limit, since we are only human, we will inevitably violate it unless a machine checks it for us. The statistics may help understand whether there can/should be a length limit. If there should be one, code should be written to check it.
If a new field is introduced that is suited to contain long descriptions, please do one of the two:
comments
field.comments
field.Otherwise, I will not be able to understand the difference between comments
vs the new field and I will act unpredictably if I write any future canonical-data.json files.
You may have seen that after I posted my view of why things are the way they are, then I didn't help. It seemed to me that I wouldn't be affected either way by any of the decisions so I had nothing to add. I think that is the natural state of affairs though, so it is not too surprising that we see those unaffected by the decision tend not to make the decision or take action.
The incentives are also not aligned very well because a decision in this issue would benefit the students or the mentors of a track, and there are maintainers that are currently neither students nor mentors. That includes myself. I have historically found it difficult to convince people to do something that doesn't benefit them in some way. But that is a discussion for another issue entirely so I'll get back on topic.
Here is some code that reads all our canonical-data.json files and outputs:
It is capable of concatenating the descriptions from all the nested levels above a case (if you pass the -r
command-line flag), but since I've already argued that that doesn't make sense to include these as part of the test identifier (they are shared between many cases, therefore the nesting/grouping features of the test suite should be used instead), I will not show the results of the -r
flag and will instead just show the result of running it without arguments.
the code
require 'json'
def cases(h)
# top-level won't have a description, but all lower levels will.
# therefore we have to unconditionally flat_map cases on the top level.
# on lower levels we either flat_map cases or just return individuals.
cases = ->(hh, path = [].freeze) {
hh['cases']&.flat_map { |c|
cases[c, (path + [hh['description'].freeze]).freeze]
} || [hh.merge(path: path)]
}
h['cases'].flat_map(&cases)
end
maxes = []
freq = Hash.new(0)
recurse = ARGV.delete('-r')
Dir.glob("#{__dir__}/exercises/*/canonical-data.json") { |f|
exercise = File.basename(File.dirname(f))
lens = cases(JSON.parse(File.read(f))).map { |c|
arbitrary = ?/
((recurse ? c[:path] : []) + [c['description']]).join(arbitrary).size
}
maxes << [lens.max, exercise]
lens.each { |l| freq[l] += 1 }
}
n = ARGV.empty? ? 10 : Integer(ARGV.first)
bucket_size = 5
(0..maxes.map(&:first).max).step(bucket_size) { |left|
right = left + bucket_size
v = freq.values_at(*(left...right).to_a).sum
puts '%3d - %3d: %d' % [left, right, v]
}
puts "total: #{freq.values.sum}"
puts "top #{n}:"
maxes.max_by(n, &:first).each { |x| puts '%3d %s' % x }
the results
0 - 5: 18
5 - 10: 55
10 - 15: 165
15 - 20: 170
20 - 25: 277
25 - 30: 208
30 - 35: 186
35 - 40: 147
40 - 45: 89
45 - 50: 100
50 - 55: 66
55 - 60: 49
60 - 65: 31
65 - 70: 18
70 - 75: 17
75 - 80: 7
80 - 85: 15
85 - 90: 6
90 - 95: 5
95 - 100: 2
100 - 105: 0
105 - 110: 2
110 - 115: 0
115 - 120: 0
120 - 125: 0
125 - 130: 0
130 - 135: 0
135 - 140: 0
140 - 145: 2
total: 1635
top 10:
143 simple-cipher
107 two-bucket
107 bowling
98 poker
90 saddle-points
85 react
84 rational-numbers
84 custom-set
81 raindrops
81 grep
Each test case now has an uuid
field! Therefore, this issue can be closed.
[Edit:] Some languages don't individually identify tests and use the canonical data "description" field to generate comments into their tests code. Other languages individually identify tests, but all that is available for generating a test-name identifiers is the "description" field. This results in some extremely long test-method-names. e.g. For
robot-simulator
Pharo has...test18_WhereRTurnRightLTurnLeftAndAAdvanceTheRobotCanFollowASeriesOfInstructionsAndEndUpWithTheCorrectPositionAndDirectionInstructionsToMoveEastAndNorth
Can an optional
identifier
field be introduced for cases where the description doesn't make a good identifier?For example, for the Bowling exercise...
generating tests for different tracks produces the following...
CSharp: public void Bonus_rolls_for_a_strike_in_the_last_frame_must_be_rolled_before_score_can_be_calculated() https://github.com/exercism/csharp/blob/master/exercises/bowling/BowlingTest.cs#L250
Delphi: procedure BowlingTests.Bonus_rolls_for_a_strike_in_the_last_frame_must_be_rolled_before_score_can_be_calculated; https://github.com/exercism/delphi/blob/master/exercises/bowling/uBowlingTests.pas#L324
FSharp: let ``Both bonus rolls for a strike in the last frame must be rolled before score can be calculated`` () = https://github.com/exercism/fsharp/blob/master/exercises/bowling/BowlingTest.fs#L177
Adding a optional
identifier
like below would be useful.