Closed lognaturel closed 2 years ago
I'm planning to do a pyxform
release first thing tomorrow (GMT-7) and would like to include this if possible. @lindsay-stevens would be fantastic if you have time for a quick review. If you don't have time or think it's riskier than I believe, let me know and we can release without!
Sadly I don't understand why Windows tests are failing.
ERROR: Should find that XLSForm conversion produces itemsets.csv from external_choices.
----------------------------------------------------------------------
Traceback (most recent call last):
File "D:\a\pyxform\pyxform\tests\test_external_instances_for_selects.py", line 418, in test_itemset_csv_generated_from_external_choices
self.assertEqual('"suburb","Footscray","vic","melbourne"\n', rows[-1])
File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\contextlib.py", line 119, in __exit__
next(self.gen)
File "D:\a\pyxform\pyxform\tests\utils.py", line 74, in get_temp_dir
shutil.rmtree(temp_dir)
File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\shutil.py", line 516, in rmtree
return _rmtree_unsafe(path, onerror)
File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\shutil.py", line 400, in _rmtree_unsafe
onerror(os.unlink, fullname, sys.exc_info())
File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\shutil.py", line 398, in _rmtree_unsafe
os.unlink(fullname)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmp9ms014r1\\select_one_external.xlsx'
Sadly I don't understand why Windows tests are failing.
I haven't tested but it could be that the workbook is still open, as mentioned here: https://openpyxl.readthedocs.io/en/latest/optimized.html
it could be that the workbook is still open
Thanks so much for helping me read the docs. That seemed like a likely culprit but either I'm closing in the wrong spot or something else is going on. Unfortunately I'm out of time for this tonight.
Having chatted with @yanokwa and Team Central, I think this fix should go in the pyxform release so I haven't released yet. We don't know how the extra columns get generated and it seems like the kind of thing that others would run into so best to patch it now. If you have a chance to review during your day, great. Ideas on the open files issue much appreciated! I'll come back to it tomorrow.
I came up with a workaround for the Windows test failures, PR here to your branch. Otherwise, this PR seems good to go.
Nice working with you on this, @lindsay-stevens 🤝
From @lindsay-stevens on the PR to my fork here:
Just a note for posterity about my guess for antivirus. Main clue was the Windows "Resource Monitor" tab for Disk activity. This activity list shows active file paths and the PID and image name doing read/write on them. When running the tests, the file was being read by a process with image name "System" and description "NT Kernel & System". Not sure what else the system would be doing with files except to scan them - but whatever it's doing we can't exactly terminate the system process 💥
The same thing happened when I changed the temp root to pyxform/tests/test_output so it doesn't seem to be related to using the system/user temporary directory.
Closes #595
Why is this the best possible solution? Were any other approaches considered?
Even though #595 is not critical, this seems like a good change. We don't ever modify an Excel file so there's no point in opening it for writing.
There are slight differences between the writeable and read-only workbook representations I had to handle. The biggest decision I made was not to worry about the difference in dealing with trailing empty rows. That is, if there are empty rows in between some rows with contents, the two representations are the same. If there are empty rows after which there are no more rows with content, the read-only representation includes those empty rows whereas the writeable one does not. One of the equivalency tests between XLS and XLSX picked up on that difference. I tried dropping all empty rows but this would change the output for a rogue
table-list
feature that uses row number to name nodes (!). I verified that empty rows don't affect the output and decided to change the test XLSX instead of changing the code. That seems like the less risky option. I changed the XLSX and XLS files the same way for thegroup
andspecify_other
forms. I deleted the rows with formatting but no contents.What are the regression risks?
There could be some side effects to the different handling of empty rows and columns. I am not thinking of any but that would be good to get a second opinion on.
Before submitting this PR, please make sure you have:
tests
nosetests
and verified all tests passblack pyxform tests
to format code