renatoathaydes / prechelt-phone-number-encoding

Comparison between Java and Common Lisp solutions to a phone-encoding problem described by Prechelt

Stronger tests would be useful to verify correctness of different implementations #17

Open ssvb opened 9 months ago

ssvb commented 9 months ago
  1. Assuming the German dictionary, here's a sample input for validating the count mode:
    88888888888888888888888888888888888888888888888888

The expected output:

2342917122930
  2. Assuming the German dictionary, here's a sample input for validating the print mode:
05-2706438/830160--236690-2719//3706425909
88888888888888888888888888888888888888888888888999
0771----303573/////7683///-//719348230271939030499

The expected output:

05-2706438/830160--236690-2719//3706425909: 0 Arbeitslosenversicherungsbeitra"ge 9
0771----303573/////7683///-//719348230271939030499: 0 Bund es aus Bildungsfo"rderung 3 Gesetz 9
0771----303573/////7683///-//719348230271939030499: 0 Bund es mu"d Bildungsfo"rderung 3 Gesetz 9
0771----303573/////7683///-//719348230271939030499: 0 Bund es Ausbildungsfo"rderung 3 Gesetz 9
0771----303573/////7683///-//719348230271939030499: 0 kund es aus Bildungsfo"rderung 3 Gesetz 9
0771----303573/////7683///-//719348230271939030499: 0 kund es mu"d Bildungsfo"rderung 3 Gesetz 9
0771----303573/////7683///-//719348230271939030499: 0 kund es Ausbildungsfo"rderung 3 Gesetz 9
0771----303573/////7683///-//719348230271939030499: 0 Bundesausbildungsfo"rderungsgesetz 9

The tests listed above are valid according to the rules of https://flownet.com/ron/papers/lisp-java/instructions.html

And if we move from the German dictionary to an arbitrary dictionary of up to 75000 words (as required by the rules), then I suspect that the following generator constructs something close to the absolute worst case that still conforms to the rules of the study: https://github.com/renatoathaydes/prechelt-phone-number-encoding/pull/16#issuecomment-1892704758

ssvb commented 9 months ago

If the tests are too harsh, simply shortening the 888...888 pattern makes them much less demanding.