PDCMFinder / pdxfinder

PDX Finder performs integration, standardization, analysis and visualization of the complex and diverse data associated with PDX mouse models for the cancer community.
https://www.pdxfinder.org/
Apache License 2.0

Clean-up data values for the release #468

Closed zperova closed 3 years ago

zperova commented 3 years ago

Description

As a developer, I want to clean up all the errors identified by the validator tool.

Acceptance criteria

zperova commented 3 years ago

This ticket needs to be broken up once the validator has been run on all datasets.

zperova commented 3 years ago

Results of the validator are here: https://app.zenhub.com/workspaces/pdxfinder-board-5f0f036188f27e0014a7bdfc/issues/pdxfinder/pdxfinder/497

zperova commented 3 years ago

Here are the suspects. Error count/provider. The count is subject to change:

- ? provider [TRACE] @Afollette: Analysis done. Errors reported in validator but data seems to be OK.
- 1 provider [UMCG] @mauroz77: Analysis done. Errors reported in validator but data seems to be OK.
- 2 provider [NKI] @Afollet: Analysis done. False positives in validator are reported.
- 3 provider [Curie-BC] @mauroz77: Analysis done. Errors reported in validator but data seems to be OK. Fixed publications value separated with `;`.
- 4 provider [Curie-LC] @mauroz77: Analysis done. Fixed data. There are some false positives in validator.
- 5 provider [Curie-OC] @mauroz77: Analysis done. Fixed data. There are some false positives in validator.
- 5 provider [IRCC-GC] @mauroz77: Analysis done. Fixed data. There are some false positives in validator.
- 6 provider [IRCC-CRC] @Afollet: Analysis done. Fixed data. There are some false positives in validator.
- 6 provider [UOM-BC] @mauroz77: Analysis done. Fixed data. There are some false positives in validator.
- 6 provider [VHIO-BC] @mauroz77: Analysis done. Fixed data. There are some false positives in validator.
- 7 provider [LIH] @mauroz77: Analysis done. Fixed data. There are some false positives in validator.
- 7 provider [PMLB] @Afollet: Analysis done. Fixed data. There are some false positives in validator.
- 7 provider [UOC-BC] @Afollet: Analysis done. Fixed data. There are some false positives in validator.
- 7 provider [VHIO-CRC] @mauroz77: Analysis done. Fixed data. There are some false positives in validator.
- 9 provider [HCI-BCM] @mauroz77: Analysis done. Fixed data. There are some false positives in validator.
- 10 provider [VHIO-PC] @mauroz77: done
- 12 provider [PDMR] @Afollet: (I am going to pass on this; modifications will be erased when we pull new data from the Oracle database. It would be better to fix it at the PDX-Transformer level than at the template level.)
- 12 provider [SJCRH] @mauroz77: done
- 13 provider [CRL] @mauroz77: done
- 14 provider [PPTC] @Afollet: (waiting on correspondence)
- 14 provider [WUSTL] @mauroz77: done
- 17 provider [MDAnderson] @mauroz77: done
- 18 provider [DFCI-CPDM] @mauroz77: waiting for provider's response

mauroz77 commented 3 years ago

Things to check:

Afollet commented 3 years ago

Thanks @mauroz77, I have made the same observations. It appears that the jtablesaw library is reading in the table and replacing the numeric types with blanks. If you use the "TYPE" function in Excel you can see that the cell value types are different. I'm going to open an issue on the jtablesaw repo about this.
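
For anyone who wants to reproduce the type check outside Excel, here is a minimal Python sketch. It is not part of the PDX Finder codebase and does not use the jtablesaw API; the helper names and the sample column are hypothetical. It assigns each cell the same code Excel's TYPE function would return (1 for a number, 2 for text, 4 for a logical value) and lists the cells whose type differs from the dominant type in the column, which are the ones that seem to end up blank.

```python
from collections import Counter


def excel_type(value):
    """Return the code Excel's TYPE function gives: 1 number, 2 text, 4 logical."""
    # bool must be tested before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return 4
    if isinstance(value, (int, float)):
        return 1
    return 2


def find_mixed_type_cells(column):
    """Return (row_index, value) pairs whose type differs from the column's dominant type."""
    # Ignore empty cells; they carry no type information.
    non_empty = [(i, v) for i, v in enumerate(column) if v is not None and v != ""]
    if not non_empty:
        return []
    dominant, _ = Counter(excel_type(v) for _, v in non_empty).most_common(1)[0]
    return [(i, v) for i, v in non_empty if excel_type(v) != dominant]


# Hypothetical column: mostly text, with two cells stored as numbers.
sample_column = ["P1", "P2", 3, "P4", 5.0, ""]
print(find_mixed_type_cells(sample_column))  # [(2, 3), (4, 5.0)]
```

Running something like this over a provider sheet before loading it should point at the rows worth checking by hand.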

zperova commented 3 years ago

Moving to the next sprint. Large job.

Afollet commented 3 years ago

I have resolved all the tumour_type changes except for JAX and DFCI-CPDM: everything that has "not specified" as a tumour_type.

I changed all the mappings to correspond with these tumour_type changes.

zperova commented 3 years ago

@Afollet I take it I can start mapping now?

Afollet commented 3 years ago

@zperova I checked the mappings yesterday. There was nothing new. Are there supposed to be new mappings?

zperova commented 3 years ago

I thought there would be about 30 mappings to do, the ones @CsabaHalmagyi was talking about, possibly from JAX.

Afollet commented 3 years ago

I have a few changes on my local machine that I will push when GitLab is running.

Correspondence with PPTC and DFCI-CPDM is still outstanding (as @mauroz77 said), but the rest of the work has been done.

Report here: https://docs.google.com/document/d/1bPl1xfT5jylv0ntONCcfoa1MLOdsZHKzXcO1cjCauVA/edit#heading=h.t0vdk2nut12d

Afollet commented 3 years ago

@zperova I resolved all the mappings I changed yesterday. There is nothing for you to wait on here.

zperova commented 3 years ago

Great, thanks @Afollet and @mauroz77. The report is very helpful! Closing.