Open ASL-rmarshall opened 7 months ago
To validate a USDM study definition contained in a single JSON file, the file is converted so that each USDM class is represented as a separate dataset, with each class instance represented as a row in that dataset. For validation purposes, the single JSON file therefore contains multiple "datasets". The rules engine currently makes some implicit assumptions that each dataset to be validated is contained in a separate file (e.g., `dataset_name` is frequently expected to contain the file name, in particular when dataset information is cached).
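To make the "one class, one dataset" idea concrete, here is a minimal sketch of the conversion. It is not the engine's actual reader: the function name `split_into_datasets` is hypothetical, and it assumes every class instance in the JSON carries its class name under an `instanceType` key.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Any


def split_into_datasets(node: Any, type_key: str = "instanceType") -> dict[str, list[dict]]:
    """Group every typed object in a nested JSON structure by its class name.

    Assumption: each class instance is a JSON object whose class name is
    stored under `type_key`. Objects without that key are only traversed.
    """
    datasets: dict[str, list[dict]] = defaultdict(list)

    def visit(value: Any) -> None:
        if isinstance(value, dict):
            class_name = value.get(type_key)
            if isinstance(class_name, str):
                # Keep scalar attributes in the row; nested objects become
                # rows of their own class dataset when they are visited.
                row = {k: v for k, v in value.items() if not isinstance(v, (dict, list))}
                datasets[class_name].append(row)
            for child in value.values():
                visit(child)
        elif isinstance(value, list):
            for item in value:
                visit(item)

    visit(node)
    return dict(datasets)
```

Calling `split_into_datasets(json.load(f))` on a study definition would then yield one list of rows per class name, all originating from the same file.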
To get CLI validation working as an interim solution, a unique "proxy file name" was generated and assigned to each dataset contained in the single JSON file (see #631), so that the rules engine could use this "proxy file name" as the `dataset_name`. The "proxy file name" was generated by appending the USDM class name to the original JSON file name and adding a ".json" extension. For example, the "proxy file name" for the "Study" class dataset in a single JSON file called "/test/data/study_def.json" would be "/test/data/study_def.json/Study.json". This format was chosen because it gives a unique value for each USDM class dataset, and because the ".json" extension may be used to select the data or metadata readers.
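The naming scheme can be sketched as follows. The helper name `proxy_file_name` is hypothetical and is not taken from #631; only the resulting format follows the description above.

```python
import os


def proxy_file_name(json_path: str, class_name: str) -> str:
    """Build the proxy name: <original file name>/<USDM class name>.json"""
    return f"{json_path}/{class_name}.json"


name = proxy_file_name("/test/data/study_def.json", "Study")
print(name)                       # /test/data/study_def.json/Study.json
print(os.path.splitext(name)[1])  # .json
```

The proxy name looks like a path but does not point to a real file; it only serves as a unique key with a recognizable extension.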
It would be better for the engine to be updated to support single files containing multiple datasets without having to create "proxy file names".
(Regarding additional changes to support cli use for USDM validation, #631) There is one item I don't quite agree with:
It seems wrong to pretend that a single json file is multiple json files at the Engine level. I think it would be better to fix the engine to be able to handle different types of dataframe collections (folder of files, single file, cosmos collection of items, etc). But this might be a much bigger change, in which case the hack is okay for now and you can create a new ticket for it instead.
Originally posted by @gerrycampion in https://github.com/cdisc-org/cdisc-rules-engine/pull/631#pullrequestreview-1977457315
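
One possible shape for the suggested change is an abstraction over "dataset collections", so that the engine asks a collection for its datasets instead of assuming one file per dataset. The sketch below is only an illustration: the class names and methods are hypothetical and do not exist in the engine, and the single-file layout (a JSON object keyed by dataset name) is an assumption made for brevity.

```python
import json
from abc import ABC, abstractmethod
from pathlib import Path


class DatasetCollection(ABC):
    """Hypothetical interface: a named set of datasets, independent of storage."""

    @abstractmethod
    def dataset_names(self) -> list:
        ...

    @abstractmethod
    def rows(self, dataset_name: str) -> list:
        ...


class FolderCollection(DatasetCollection):
    """One JSON file per dataset; the file stem is the dataset name."""

    def __init__(self, folder: str) -> None:
        self._files = {p.stem: p for p in Path(folder).glob("*.json")}

    def dataset_names(self) -> list:
        return sorted(self._files)

    def rows(self, dataset_name: str) -> list:
        return json.loads(self._files[dataset_name].read_text(encoding="utf-8"))


class SingleFileCollection(DatasetCollection):
    """One JSON file holding many datasets (assumed keyed by dataset name)."""

    def __init__(self, path: str) -> None:
        self._data = json.loads(Path(path).read_text(encoding="utf-8"))

    def dataset_names(self) -> list:
        return sorted(self._data)

    def rows(self, dataset_name: str) -> list:
        return self._data[dataset_name]
```

With an interface along these lines, `dataset_name` could stay a plain dataset name, and caching could key on the pair (collection, dataset name) rather than on a file name. Other sources mentioned in the review, such as a Cosmos collection of items, would be further implementations of the same interface.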