mdsol / rwslib

Provide a (programmer) friendly client library to Rave Web Services (RWS).
MIT License

studyEvents and associated CRF per subject #128

Closed vagarwal77 closed 2 years ago

vagarwal77 commented 2 years ago

I have the following requirement:

For a given subject, I need to get the list of visits (studyEvents) and the associated CRFs per visit.

I looked into rwslib and did not find much. I can see VersionFoldersRequest, but it is not specific to a subject, nor does it include CRF information.

Please advise; I would have thought this is a very common use case.

isparks commented 2 years ago

You can get an ODM-file per subject. This is a tree of all visits, forms, form data etc.

https://rwslib.readthedocs.io/en/latest/retrieve_clinical_data.html#subjectdatasetrequest-project-name-environment-name-subjectkey
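As a rough sketch of how that tree can be walked once you have it: the snippet below assumes the response is standard ODM 1.3 clinical data (`ClinicalData/SubjectData/StudyEventData/FormData`) and uses only the standard library. The function name `visits_and_forms` and the sample document are made up for illustration, and fetching the XML itself is left out.

```python
import xml.etree.ElementTree as ET

# ODM 1.3 default namespace; assumed to be what the dataset response uses.
ODM_NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}


def visits_and_forms(odm_xml):
    """Map each StudyEventOID in a subject's ODM to the FormOIDs recorded under it."""
    root = ET.fromstring(odm_xml)
    result = {}
    for event in root.iterfind(".//odm:SubjectData/odm:StudyEventData", ODM_NS):
        # The same visit may appear in several blocks, so merge rather than overwrite.
        forms = result.setdefault(event.get("StudyEventOID"), [])
        for form in event.iterfind("odm:FormData", ODM_NS):
            oid = form.get("FormOID")
            if oid not in forms:
                forms.append(oid)
    return result


# Hypothetical sample, not real Rave output.
SAMPLE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <ClinicalData StudyOID="Mediflex(Prod)" MetaDataVersionOID="1">
    <SubjectData SubjectKey="001">
      <StudyEventData StudyEventOID="SCREENING">
        <FormData FormOID="DM"/>
        <FormData FormOID="VS"/>
      </StudyEventData>
      <StudyEventData StudyEventOID="VISIT1">
        <FormData FormOID="VS"/>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>"""

print(visits_and_forms(SAMPLE))
# {'SCREENING': ['DM', 'VS'], 'VISIT1': ['VS']}
```

Note that this only lists visits and forms that already exist in the subject's data, not those that could still be added.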

johnsynnott commented 2 years ago

Hi isparks, thanks a bunch! This is perfect for getting the clinical data of a specific user.

On top of that, would you happen to know if it's possible to get which events and forms could be filled out for a given subject? In essence what we're after is the set of StudyEventDefs and FormDefs (FormRefs would be nice too!) that a given subject could potentially fill out while still being valid.

Thanks again!

glow-mdsol commented 2 years ago

You'll need to combine the core metadata service (which gives the default matrix) with one of the ODM Adaptor datasets like VersionFolders; you need to stitch it together from there. There's a VersionFoldersWithForms dataset which is much more extensive and I've used that.
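A minimal sketch of the "stitching" step, assuming the dataset response is ODM 1.3 metadata in which each `StudyEventDef` (folder) carries `FormRef` children. The function name `folders_to_forms` and the sample XML are hypothetical, and only the standard library is used.

```python
import xml.etree.ElementTree as ET

# ODM 1.3 default namespace; assumed to be what the dataset response uses.
ODM_NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}


def folders_to_forms(metadata_xml):
    """Map each StudyEventDef OID to the FormOIDs referenced by its FormRef children."""
    root = ET.fromstring(metadata_xml)
    mapping = {}
    for event_def in root.iterfind(".//odm:MetaDataVersion/odm:StudyEventDef", ODM_NS):
        # A folder can show up in more than one MetaDataVersion block, so merge.
        forms = mapping.setdefault(event_def.get("OID"), [])
        for form_ref in event_def.iterfind("odm:FormRef", ODM_NS):
            oid = form_ref.get("FormOID")
            if oid not in forms:
                forms.append(oid)
    return mapping


# Hypothetical sample, not real Rave output.
SAMPLE_METADATA = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <Study OID="Mediflex(Prod)">
    <MetaDataVersion OID="1" Name="v1">
      <StudyEventDef OID="SCREENING" Name="Screening" Repeating="No" Type="Common">
        <FormRef FormOID="DM" OrderNumber="1" Mandatory="Yes"/>
        <FormRef FormOID="VS" OrderNumber="2" Mandatory="No"/>
      </StudyEventDef>
      <StudyEventDef OID="VISIT1" Name="Visit 1" Repeating="No" Type="Common">
        <FormRef FormOID="VS" OrderNumber="1" Mandatory="No"/>
      </StudyEventDef>
    </MetaDataVersion>
  </Study>
</ODM>"""

print(folders_to_forms(SAMPLE_METADATA))
# {'SCREENING': ['DM', 'VS'], 'VISIT1': ['VS']}
```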

johnsynnott commented 2 years ago

Hi Glow, thanks for the direction - unfortunately I think I'm still a bit stuck. A couple more questions:

  1. By core metadata service do you mean requests from the rwslib.rws_requests package?
  2. Is there any extra info on the VersionFoldersWithForms request? I can't find it in the docs or in the code from rwslib on pypi.

It seems like none of the requests (except the clinical data requests) require a subject to be specified. Am I barking up the wrong tree in trying to figure out which events and forms a subject might be eligible for, given the forms that have already been filled out for them? For example, VersionFolders is perfect for getting the set of StudyEventRefs, but then which request(s) would you make to determine which of those study events a given subject could actually be part of?

isparks commented 2 years ago

Hi @johnsynnott. You originally asked:

..would you happen to know if it's possible to get which events and forms could be filled out for a given subject

This isn't easy. You can find out a core of visits/forms (the Rave Primary Matrix) and you can get the definition of all the possible forms, but knowing what's expected for a subject is going to be data-driven. Rave has edit check actions like MergeMatrix and AddForm which can add (or remove) Visit/Form combinations dynamically based on user input. Similarly, questions can be made visible/invisible based on input. In addition, Rave has a subject admin feature which allows users to manually add visits (from the set of possible visits) and forms to those visits. If I recall correctly, users can even add forms which don't normally belong in those visits, so you could get a Demography form turning up in the Adverse Event folder if an admin user felt like it. Not likely, but possible.
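To make the distinction concrete, here is a small sketch that compares two folder-to-forms mappings: one taken from the base matrix and one taken from a subject's actual data, however you obtained them. The function name and sample values are hypothetical. It only reports differences; it cannot predict what edit checks or admin users will add later.

```python
def compare_with_matrix(matrix, subject_data):
    """Return (outside_matrix, not_yet_entered) as sets of (folder, form) pairs."""
    expected = {(folder, form) for folder, forms in matrix.items() for form in forms}
    actual = {(folder, form) for folder, forms in subject_data.items() for form in forms}
    # Pairs in the data but not the matrix were added dynamically or by hand;
    # pairs in the matrix but not the data have not been created for this subject.
    return actual - expected, expected - actual


# Hypothetical values for illustration only.
matrix = {"SCREENING": ["DM", "VS"], "VISIT1": ["VS"]}
subject = {"SCREENING": ["DM", "VS"], "AE": ["AE", "DM"]}

outside, missing = compare_with_matrix(matrix, subject)
print(sorted(outside))  # [('AE', 'AE'), ('AE', 'DM')]
print(sorted(missing))  # [('VISIT1', 'VS')]
```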

isparks commented 2 years ago

Also @johnsynnott, the VersionFoldersWithForms dataset looks like it was introduced in Rave 2017.X but has not (yet) been added to rwslib.

GET https://{host}/ravewebservices/datasets/VersionFoldersWithForms.odm?studyoid={study-oid}&metadataversionoid={versionID}

Example:

GET https://my-organisation.mdsol.com/RaveWebServices/datasets/VersionFoldersWithForms.odm?studyoid=Mediflex(Prod)

Documented here: https://learn.mdsol.com/api/rws/retrieve-metadata-with-the-version-folders-with-forms-dataset-232285921.html
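If you want to build that URL in code, something like the following should work. This is only a sketch: the function name is made up, and treating `metadataversionoid` as optional is an assumption based on the example request omitting it.

```python
from urllib.parse import urlencode


def version_folders_with_forms_url(host, study_oid, metadata_version_oid=None):
    """Build the GET URL for the VersionFoldersWithForms dataset in ODM format."""
    params = {"studyoid": study_oid}
    if metadata_version_oid is not None:
        # Assumption: the version filter can be left out, as in the example request.
        params["metadataversionoid"] = metadata_version_oid
    query = urlencode(params)  # percent-encodes the parentheses in the study OID
    return f"https://{host}/RaveWebServices/datasets/VersionFoldersWithForms.odm?{query}"


print(version_folders_with_forms_url("my-organisation.mdsol.com", "Mediflex(Prod)"))
# https://my-organisation.mdsol.com/RaveWebServices/datasets/VersionFoldersWithForms.odm?studyoid=Mediflex%28Prod%29
```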

From the above page the output would be:

<?xml version="1.0" encoding="utf-8"?>
<ODM  >
  <Study OID="{study-oid}">
    <GlobalVariables>
      <StudyName>{study-oid}</StudyName>
      <StudyDescription />
      <ProtocolName>{study-id}</ProtocolName>
    </GlobalVariables>
    <MetaDataVersion OID="{metadata-oid}" Name="{version-name}" mdsol:PrimaryFormOID="{primary-form}">
      <Protocol>
        <!-- StudyEventRef elements - for each Folder in and outside of the base matrix -->
      </Protocol>
      <!-- StudyEventDef elements - for each Folder in and outside of the base matrix -->
      <StudyEventDef>
        <!-- FormRef elements - for each Form in the folder -->
      </StudyEventDef>
    </MetaDataVersion>
    <!-- More MetaDataVersion elements - for the specified Version "in-use" -->
  </Study>
</ODM>

glow-mdsol commented 2 years ago

You can use something like:

    def get_version_folders_with_forms(self, project, metadata_version,
                                       environment="Prod", ignore_matrices=None) -> VersionFoldersWithFormsWrapper:
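        """
        Get the VersionFolders (used to mock out the Matrix)
        :param str project: Project Name
        :param str metadata_version: CRF Version
        :param str environment: Environment Name
        :param list[str] ignore_matrices: OIDs of matrices to skip (optional)
        """
        study_env_oid = f"{project}({environment})"
        # ConfigurableDatasetRequest is imported from rwslib.rws_requests
        results = self._client.send_request(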
            ConfigurableDatasetRequest(
                dataset_name="VersionFoldersWithForms",
                dataset_format="odm",
                params=dict(
                    StudyOid=study_env_oid, MetadataVersionOID=metadata_version
                ),
            )
        )
        if self._client.last_result.status_code != 200:
            response = self._client.last_result
            logger.error(
                f"Error accessing {response.url} -> {response.status_code}: {response.content}"
            )
            raise ClientError(
                f"Unable to get version folders "
                f"with forms for {project} ({environment}) - {metadata_version}"
            )
        version_folders = VersionFoldersWithFormsWrapper.from_content(results,
                                                                      filter_matrices=ignore_matrices)
        return version_folders

where the VersionFoldersWithFormsWrapper is something like:

from lxml import etree as ET  # assumed: the .nsmap attribute used below is lxml-specific


class VersionFoldersWithFormsWrapper:

    def __init__(self, folders, matrices):
        """
        Create a wrapper
        :param dict[str, StudyEventWrapper] folders: Folders
        :param dict[str, Matrix] matrices: Matrices
        """
        self._folders = folders
        self._matrices = matrices

    def get_matrix(self, matrix_oid):
        return self._matrices.get(matrix_oid)

    def get_folder(self, folder_oid):
        return self._folders.get(folder_oid)

    @property
    def folders(self):
        return sorted(self._folders.values(), key=lambda x: x.order_number)

    @classmethod
    def from_content(cls, result, filter_matrices=None) -> 'VersionFoldersWithFormsWrapper':
        _filter = filter_matrices if filter_matrices else []
        m = ET.fromstring(result.encode())
        ns = m.nsmap
        ns.update(dict(odm=ns.pop(None)))
        folders = {}
        matrices = {}
        for mdx in m.findall(".//odm:MetaDataVersion", namespaces=ns):
            is_default = False
            matrix_oid = mdx.get(mdsol("MatrixOID"), "SUBJECT")
            if matrix_oid in _filter:
                # skip matrix by name
                continue
            # print(f"Processing Matrix {matrix_oid}")
            for ser in mdx.findall("./odm:Protocol/odm:StudyEventRef", namespaces=ns):
                study_event_ref = ser.get("StudyEventOID")
                study_event = folders.get(study_event_ref)  # type: StudyEventWrapper
                if not study_event:
                    # create a new folder
                    study_event = StudyEventWrapper(
                        oid=study_event_ref,
                        order_number=ser.get("OrderNumber"),
                        name=ser.get(mdsol("StudyEventDefName")),
                        repeating=ser.get(mdsol("StudyEventDefRepeating")),
                        type=ser.get(mdsol("StudyEventDefType"))
                    )
                    folders[study_event_ref] = study_event
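                # look up the StudyEventDef for this folder (not just the first one
                # in the version) and register every FormRef it contains
                study_event_def = mdx.find(
                    f"./odm:StudyEventDef[@OID='{study_event_ref}']", namespaces=ns)
                if study_event_def is not None:
                    for form_ref in study_event_def.findall("./odm:FormRef", namespaces=ns):
                        study_event.add_form(form_ref.get('OrderNumber'),
                                             form_ref.get('FormOID'),
                                             form_ref.get('Mandatory') == "Yes",
                                             matrix_oid)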
                study_event.add_to_matrix(matrix_oid)
                matrix = matrices.setdefault(matrix_oid, Matrix(matrix_oid, is_default))
                matrix.add_folder(study_event)
        return cls(folders, matrices)

vagarwal77 commented 2 years ago

Thanks @isparks @glow-mdsol.

As you mentioned, I am trying to get access to the list of Rave Web Services endpoints, but I don't have access to it yet.

Documented here: https://learn.mdsol.com/api/rws/retrieve-metadata-with-the-version-folders-with-forms-dataset-232285921.html

I have applied for access, but in the meantime, is this document available somewhere else? Google was not very helpful here :)

glow-mdsol commented 2 years ago

Datasets are not always documented in the same way - they can be added via an external process and may or may not be documented fully. The core endpoints are referenced in the location you mention, and there's kind of a grab bag for some of the CDS.

glow-mdsol commented 2 years ago

Ok to close this?