Closed by plocket 2 years ago
[The pain of having to repeat the same values for multiple tests might be] dealt with using a cucumber 'Background'.
[Edit: "Background" might be a good feature to add in general, but the internal feedback is that we want to be able to run randomized tests anyway.]
Notes from a standup discussion (08/06/21):
Just FYI, some folks have big questions about the use of randomized tests and whether they're really an appropriate tool.
The main non-technical challenge I see here is giving enough useful output to the user. Will need a lot of feedback on that.
Current MVP ideas (not much research so far):
Generate a random value based on the type of the field.

Getting around tricky situations: We already have interviews where there are some tricky inputs that these kinds of tests may not be able to overcome, especially at MVP. To make this usable, and therefore user-testable, from the start, we need ways to get through these situations. Ideas:
Questions:
As @rpigneri-vol named them, these random tests are our "spellcheck" option - they aren't a proper testing suite and shouldn't be treated as such, but they may be better than nothing.
New home of "faker": https://www.npmjs.com/package/community-faker
Question was raised:
[Because the system is randomized, not deterministic (?)], can we somehow create a seed for a test so that it's reproducible?
I've never looked into how to do that. Is it possible? Do we need it? [Maybe an answer: I don't think we need this. From what I'm understanding, this refers to the interview, not to the test, and the interview is deterministic. At least, all interviews that I've ever worked with have been deterministic.]
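If we did decide we want reproducible runs, one rough sketch of how it could work: derive every random choice from one seed, print that seed in the report, and let a user re-run with the same seed. This is hypothetical, not existing ALKiln code; `mulberry32` is just a well-known tiny seedable PRNG.

```javascript
// Sketch: a tiny seedable PRNG (mulberry32). A test run would pick a seed,
// report it, and derive every random choice from it, so the run can be
// replayed exactly. Hypothetical, not existing ALKiln code.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
  };
}

// Same seed, same sequence of "random" values
const seed = 12345;
const rand1 = mulberry32(seed);
const rand2 = mulberry32(seed);
console.log(rand1() === rand2()); // true
```

The seed could come from an env var when replaying, or be generated fresh (and logged) on a normal run.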
Output thoughts:
In the report, only list the "name" of the test (maybe just sequential numbers) and the order of the screens (page id and/or title?). Each test then has a ~file or~ folder (with a matching name) in the downloadable artifacts containing:
A file with:
#trigger element, that may not be possible. Unless... could we change proxy var handling so that each individual var row has a limit for how many times it can be used? E.g. the trigger variable could contain a number instead of a variable name. We could think of using that as a technique in general.

Should this be in its own .feature file? Should it be in both places? Should this be included in the report as well? That would make it easily accessible. It would be a lot of info, though. Should the report only include the reproducible test text? That sounds confusing to read as the only output. [If it's a story table, should it be in alphabetical order?]

Other files in the folder:
Maybe the name of the folder would also contain "failed" if it failed. Or maybe there'd be a "failed" folder for all failed tests? I don't love nested folders, though.
So, first brainstorm for what random input output might look like in the downloaded artifacts folder, including folders (folder 1 is open):
report.txt
"""
Some ALKiln title with date, time, and ALKiln version
====================
Failed tests
====================
failed_random_input_tests_1 question ids and titles
accept-terms: "Do you accept the terms?"
name-question: "What is your name?"
contact-info: "What is your address?"
lawyer-name: "Do you have a lawyer?" (infinite loop)
final target variable: We couldn't find this info in the page. See our-docs.com.
--------------------
failed_random_input_tests_2 question ids and titles
<etc>
"""
> failed_random_input_tests_1
<some doc name>.pdf
failure_screenshot.png
failed_random_input_tests_1_report.txt
"""
Some ALKiln title with date, time, and ALKiln version
final target variable: We couldn't find the target variable of the page. See our-docs.com.
--- Test (copy into your own file in "Sources" folder. More instructions?) ---
Feature: Replace with description
Scenario: Replace with description
Given I start the interview at "a-legal-form.yml"
And I get to the question id "replace with your target question id" with this data:
| var | value | trigger |
| accept_terms | True | |
| user.name.first | Reina | |
| user.name.last | Gonzalez | |
| user.address.address | 342 Main St. | |
| user.address.city | Boise | |
| user.address.state | Idaho | |
| user.phone_number | 555-555-5555 | |
| has_lawyer | False | |
--- End of test ---
failed_random_input_tests_1 question ids and titles
accept-terms: "Do you accept the terms?"
name-question: "What is your name?"
contact-info: "What is your address?"
lawyer-name: "Do you have a lawyer?" (infinite loop)
failed_random_input_tests_1 question ids and titles
accept-terms: "Do you accept the terms?"
check the accept_terms checkbox
Continued
name-question: "What is your name?"
user.name.first was set to "Reina"
user.name.last was set to "Gonzalez"
Continued
contact-info: "What is your address?"
user.address.address was set to "342 Main St."
user.address.city was set to "Boise"
user.address.state was set to "Idaho"
user.phone_number was set to "Lorem ipsum"
Tried to continue
invalid answer for user.phone_number: "This answer needs to be a phone number"
user.phone_number was set to "555-555-5555"
Continued
lawyer-name: "Do you have a lawyer?" (infinite loop)
checked the has_lawyer checkbox
Continued, but saw the same page
target variable: We couldn't find the target variable of the page. See our-docs.com.
JSON variable values on the final page:
{
...
}
"""
> failed_random_input_tests_2
> passed_random_input_tests_1
I'm using indentation to denote contents of the folder or file.
Not sure what to call an "infinite loop" question. I don't think anyone else uses the name "infinite loop" to describe questions where you press continue and you just keep getting the same question over and over again.
Infinite loops: we may only catch single-page infinite loops and probably not all of those either. Will try to write a whole comment about that later.
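For the single-page case, a sketch of what detection might look like, assuming we can read some stable page identifier (question id, or a hash of the page HTML) after each Continue. All names here are hypothetical, not existing ALKiln code; multi-page cycles would need a history of past pages instead of just the last one, which is part of why we probably won't catch them all.

```javascript
// Sketch of single-page "infinite loop" detection: if the same page id
// comes back too many times in a row after Continue, flag a probable loop.
// Hypothetical names, not existing ALKiln code.
const MAX_SAME_PAGE = 3;

function makeLoopDetector() {
  let lastPageId = null;
  let sameCount = 0;
  return function sawPage(pageId) {
    if (pageId === lastPageId) {
      sameCount += 1;
    } else {
      lastPageId = pageId;
      sameCount = 1;
    }
    // True once the same page has repeated MAX_SAME_PAGE times in a row
    return sameCount >= MAX_SAME_PAGE;
  };
}

const sawPage = makeLoopDetector();
sawPage('lawyer-name'); // false
sawPage('lawyer-name'); // false
console.log(sawPage('lawyer-name')); // true: probable single-page loop
```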
Edit: Maybe this:
| has_lawyer | False | |
...needs to be replaced with interactive Steps:
And I set the variable "has_lawyer" to "False"
And I tap to continue
Then I got to the next page
(or whatever that last one is). That way we'd be able to put the id in for the has_lawyer page and replicate the test completely.
Can we add to the report a list of the possibly hidden fields? Or just the fields that were on the screen and the values that they had or did not have?
Use small values for integers (0-10) so you don't test with 99 children. Maybe a similar approach to answer "no" after a couple of screens where you are asked "is there another".
General: if I've seen this screen 5 times, pick a different button this time.
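A sketch of how those heuristics could fit together. Everything here is hypothetical (names, thresholds, and the shape of the API), just to make the ideas concrete:

```javascript
// Sketch of the heuristics above: small integers, "no" to "is there
// another?" after a couple of rounds, and a different button once a screen
// keeps repeating. All names and thresholds are hypothetical.
function makeAnswerPicker(rand = Math.random) {
  const screenVisits = new Map();
  return {
    // Small integers (0-10) so we never generate e.g. 99 children
    smallInt() {
      return Math.floor(rand() * 11);
    },
    // Answer "yes" to "is there another?" at most twice, then "no"
    anotherAnswer(roundsSoFar) {
      return roundsSoFar < 2;
    },
    // If we've seen this screen several times, rotate to a different button
    pickButton(screenId, buttons) {
      const visits = (screenVisits.get(screenId) || 0) + 1;
      screenVisits.set(screenId, visits);
      if (visits >= 5) return buttons[visits % buttons.length];
      return buttons[Math.floor(rand() * buttons.length)];
    },
  };
}
```

Passing in `rand` would also let this plug into a seeded generator if we ever want reproducible runs.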
Deep dive discussion: creating the feature file as a separate file might be useful! A failed random test is a good candidate for a test you always run the same way.
Where to put error screenshots for easy access? And can we give more useful info in filenames now? From the #429 discussion about artifacts structure.
Maybe the error screenshots should be in both places. This is the idea proposed here, though the names are missing a bunch of pieces, to avoid a mess and to show the arrangement more clearly.
- report.txt (whole report)
- error-3pm-scenario 1 description-endingPgID.png
- error-4pm-scenario 2 description-endingPgID.png
- 3pm-endingPgID-scenario 1 description (folder)
- report-endingPgID.txt (scenario only)
- download-3pm-file-1-pgID.pdf
- error-3pm-pgID.png (Same pic as "error-3pm-scenario 1 description.png". Same name seems dumb because it would have the scenario name in it too)
- json-3pm-pgID.json
- scenario 1 description.feature
- screenshot-3pm-pgID.png
- 4pm-endingPgID-scenario 2 description (folder)
- report.txt (scenario only)
- error-4pm-pgID.png (Same as scenario 1 error pic name rationale)
- scenario 2 description.feature
"3pm" and "4pm" just indicate timestamps. "PgID" could be the question id or the screen's trigger var, whichever is available, or neither if none is available. Not sure which is preferred.
Everything would be in one artifact folder.
A challenge we'll have with creating the story table output for users:
Every field name representing a variable needs to be base64-decoded. Currently that means we have multiple guesses for what a field name might be, which would add a lot of nonsense rows to the table that are duplicates of each other.
Proposals to reduce this problem:
Detect invalid variable-name characters (which are often present when text has been decoded one time too many) and remove those rows from the table. We can probably also remove those from the field name guesses as a bonus.
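A sketch of what that filtering might look like. The helper names and the exact regex are hypothetical; the real rules for what can appear in a variable name would need more thought (attribute access, index brackets, dict keys, etc.):

```javascript
// Sketch: drop base64-decode "guesses" that contain characters which can't
// appear in a variable name, since those were probably decoded one time too
// many. Helper names and regex are hypothetical.
const VALID_VAR_NAME = /^[A-Za-z_][A-Za-z0-9_.\[\]'"-]*$/;

function decodeGuess(text) {
  try {
    return Buffer.from(text, 'base64').toString('utf8');
  } catch (err) {
    return null;
  }
}

function filterGuesses(guesses) {
  return guesses.filter((g) => g !== null && VALID_VAR_NAME.test(g));
}

// "dXNlci5uYW1lLmZpcnN0" is base64 for "user.name.first"; decoding the
// result a second time produces garbage that the filter removes.
const once = decodeGuess('dXNlci5uYW1lLmZpcnN0');
const twice = decodeGuess(once);
console.log(filterGuesses([once, twice])); // [ 'user.name.first' ]
```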
This is an alternative to having the developer write out every scenario to cover all their code. It's not ideal, but since you can't abstract in cucumber, writing every single scenario can be a huge task. It is possible, in cucumber, to allow the user to pass in data structures like lists. We would just have to handle randomly selecting them.
Note: This is not a fault in cucumber - it's not meant to be used the way we're using it.
Also need to think about whether the developer will need to copy/paste this 'scenario' however many times they want the random tests to be run, or if we can run them repeatedly somehow. This might be better in its own issue. [Edit: This is probably doable now that we know how to set, and reset, our own custom timeouts.]
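One rough shape for the "run them repeatedly" idea: a single step that loops over N random walks, giving each walk its own timeout instead of relying on one scenario-level timeout. `runOneRandomWalk` and everything else here is hypothetical, just to show the loop-with-per-walk-deadline structure:

```javascript
// Sketch: run N random walks inside one step instead of copy/pasting the
// Scenario N times, with a per-walk timeout we manage ourselves.
// `runOneRandomWalk` is a hypothetical async function that performs one
// randomized pass through the interview.
async function runRandomWalks(runOneRandomWalk, count, timeoutMsPerWalk) {
  const results = [];
  for (let i = 0; i < count; i++) {
    let timer;
    const deadline = new Promise((unused, reject) => {
      timer = setTimeout(() => reject(new Error(`walk ${i} timed out`)), timeoutMsPerWalk);
    });
    try {
      // Whichever settles first wins: the walk finishing, or the deadline
      const value = await Promise.race([runOneRandomWalk(i), deadline]);
      results.push({ walk: i, ok: true, value });
    } catch (err) {
      // One timed-out or failed walk shouldn't abort the remaining walks
      results.push({ walk: i, ok: false, error: err.message });
    } finally {
      clearTimeout(timer);
    }
  }
  return results;
}
```

One failed walk gets recorded and the loop keeps going, so a single report could cover all N walks, passed and failed.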