Open JustinasJas opened 5 months ago
A more general problem seems to be the possibility of creating a mismatch between an importer/exporter and the specification it actually uses. Such a mismatch is easily produced by copying and renaming importers/exporters.
Toolbox provides options to copy and duplicate items in multiple windows. However, copying an importer/exporter does not create a copy of its specification. A specification shared among multiple importers/exporters is not an intuitive outcome of copying, and this fact is easy to miss. This is not as dangerous as the previously described cases, where entire specifications are lost with any edit. That said, an unaware user may significantly modify the original specification, losing the associated work.
The JSON file for an importer/exporter specification is created with the name the specification has at the time that name is entered in the importer/exporter specification editor. However, renaming the specification in the editor does not change the name of the JSON file. When a new importer/exporter specification is created with a name that matches an existing JSON file, Toolbox asks whether the file should be overwritten, even though the importer/exporter that created that file has since been renamed (Toolbox gives no way to see what the issue is). After choosing to overwrite the first JSON file, both importers/exporters refer to the same file and modify it entirely (including the "name" written inside the JSON). I.e., this is an additional case of the issue of potentially losing specifications. Basic case to obtain this issue: (1) want to create exporters named A and B; (2) create exporter A, but accidentally name it and its specification B; (3) rename the first exporter and its specification to A in the specification editor, as initially intended; (4) create exporter B; (5) upon saving its specification as B, receive a message that B.json exists and choose to overwrite it. After such a misnaming, it seems better to delete and re-add the exporter than to rename it.
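The five steps above can be replayed in a small toy model. This is only a sketch of the behavior as described in this report, not Spine Toolbox code: `SpecStore`, `create` and `rename` are hypothetical names, and the store is a plain dictionary standing in for the JSON files on disk.

```python
# Toy model only: not Spine Toolbox code. All names here are hypothetical.

class SpecStore:
    """Maps JSON file names to the specification content stored in them."""

    def __init__(self):
        self.files = {}  # file name -> dict with the specification content

    def create(self, name, overwrite=False):
        """Create a specification; the file name is fixed at creation time."""
        file_name = f"{name}.json"
        if file_name in self.files and not overwrite:
            raise FileExistsError(file_name)
        self.files[file_name] = {"name": name, "mappings": []}
        return file_name

    def rename(self, file_name, new_name):
        """Rename the specification inside the file; the file name stays."""
        self.files[file_name]["name"] = new_name


store = SpecStore()

# Steps (2)-(3): exporter A is accidentally created as B, then renamed to A.
file_of_a = store.create("B")
store.rename(file_of_a, "A")
store.files[file_of_a]["mappings"].append("mapping made for A")

# Steps (4)-(5): exporter B is created and the overwrite prompt is accepted.
file_of_b = store.create("B", overwrite=True)

# Both exporters now point at the same file, and the work done for A is gone.
print(file_of_a == file_of_b)    # True
print(store.files[file_of_a])    # {'name': 'B', 'mappings': []}
```

In this model the loss comes from the file name being derived from the specification name only once, at creation, so a later rename leaves the old file name free to collide.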
Case 1 - specification selection
Problem: In a project with multiple importers and their specifications, select one importer and then a specification other than its own in the "Specification" field of the "Importer properties" window. Then open the Importer specification editor in any of the possible ways: (1) double-clicking the selected importer, (2) double-clicking the input file in the "Available resources" list of the "Importer properties" window, or (3) clicking the "edit specification" button in the "Importer properties" window. The editor shows the selected specification, which no longer has its input data, together with the data of the selected importer, which no longer has its specification. This is despite the absence of a link between such an input data connection and importer pair.

Example: Specification selection.zip contains a project with two data files and two working importers. The project also contains a third specification, left over from a deleted importer, to illustrate an additional element that could be mixed up in the selection. The project allows reproducing the problem with the selection described above.
Case 2 - importer with multiple specifications
Problem: A single importer allows setting specifications for multiple input files, but saves them for only one. Setting specifications for the second file deletes the specifications for the first. In turn, setting specifications for the first file again deletes the specifications for the second file.

Example: Importer with multiple specifications.zip contains a project with a single importer for Nodes.xlsx and Technology.xlsx. The current mappings for Technology have deleted the initial mappings for Node.

Note on specifications: In this case one moves between specifications by selecting an input file under "Available resources" in the "Importer properties" window, which is not expected. In this project only one specification is present in the "Specification" field. However, in the presence of other importers (including deleted ones), the behavior of the importer becomes much harder to understand.
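The overwrite behavior in this case can be illustrated with a toy model. This is a sketch based only on the symptoms reported here and makes no claim about how Toolbox stores mappings internally: `SingleSlotImporter`, `PerFileImporter` and `map_both_files` are hypothetical names. The first class keeps one slot for mappings regardless of the input file, the second keys them by file.

```python
# Toy model only: not Spine Toolbox code. All class and method names are hypothetical.

class SingleSlotImporter:
    """Keeps mappings for one input file only, as observed in Case 2."""

    def __init__(self):
        self._slot = None  # (input file, mappings) for a single file

    def set_mappings(self, input_file, mappings):
        # Whatever was stored for another file is replaced here.
        self._slot = (input_file, list(mappings))

    def get_mappings(self, input_file):
        if self._slot is not None and self._slot[0] == input_file:
            return self._slot[1]
        return []


class PerFileImporter:
    """Alternative that keeps a separate set of mappings per input file."""

    def __init__(self):
        self._by_file = {}

    def set_mappings(self, input_file, mappings):
        self._by_file[input_file] = list(mappings)

    def get_mappings(self, input_file):
        return self._by_file.get(input_file, [])


def map_both_files(importer):
    """Map Nodes.xlsx, then Technology.xlsx, and return what is left for Nodes.xlsx."""
    importer.set_mappings("Nodes.xlsx", ["node mapping"])
    importer.set_mappings("Technology.xlsx", ["technology mapping"])
    return importer.get_mappings("Nodes.xlsx")


print(map_both_files(SingleSlotImporter()))  # []
print(map_both_files(PerFileImporter()))     # ['node mapping']
```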
Takeaways
Risk: When the importer editor window shows previously mapped data without any mappings, the intuitive interpretation is that the mappings were lost due to a missed save or some corruption, and that the mapping should be redone. Redoing it, however, in addition to the extra work, would mess up the old mappings. The potential loss of progress can be quite significant when working with large and complex input data files.

Mitigation: The described combination of selections, and the use of multiple files with a single importer, should not be possible, or should at least trigger a warning.
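One possible shape for such a warning is sketched below. It is an illustration only, not an existing Toolbox function: `selection_warnings` and the `spec_owners` mapping (specification name to the importers that use it) are hypothetical, as are the item names in the example.

```python
# Sketch of a possible guard: not Spine Toolbox code. All names are hypothetical.

def selection_warnings(importer, chosen_spec, spec_owners):
    """Return warnings for selecting `chosen_spec` on `importer`.

    `spec_owners` maps a specification name to the list of importer names
    that currently use it.
    """
    warnings = []
    others = [name for name in spec_owners.get(chosen_spec, []) if name != importer]
    if others:
        warnings.append(f"'{chosen_spec}' is already used by: {', '.join(others)}")
    return warnings


spec_owners = {"Spec 1": ["Importer 1"], "Spec 2": ["Importer 2"]}

# Selecting another importer's specification produces a warning.
print(selection_warnings("Importer 1", "Spec 2", spec_owners))
# ["'Spec 2' is already used by: Importer 2"]

# Selecting the importer's own specification does not.
print(selection_warnings("Importer 1", "Spec 1", spec_owners))
# []
```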