MathematicalMedicine / diver-issues

Semipublic tracking of issues for the DIVER front end

Pedigree Drawing functionality in Ascertain Pedigrees #137

Open Viqsi opened 1 year ago

Viqsi commented 1 year ago

This is part of the Ascertain Pedigrees UX, and, um, oh my.

Supposedly this is implemented somewhere in Data Explorer already; I'll probably want to take a look-see at that. (Because otherwise I've very little clue on how to begin - they have to be drawn on-the-fly...)

Viqsi commented 1 year ago

Today's Sudden Revelation: this is also a PHI issue and so permissions come into play. ☹️

Viqsi commented 1 year ago

Per email discussion, apparently there's an additional implemented live service that wraps Madeline that Data Explorer calls. It also apparently generates PDFs, but that's only done offline. (Is that a technical constraint or just their approach to doing things? IDK.)

It'll be discussed later today on a call.

Viqsi commented 1 year ago

That service is publicly visible at https://www.nimhgenetics.com/pedigree/. There's useful documentation and samples directly available there. Rajiv also pointed out an API test page (at https://www.nimhgenetics.org/pedigree/api-test.html). It accepts what looks very much like standard pedigree files (with some twists), so at least we'd have no worries about who has access to what individuals - we provide the individuals' information to the service as part of invocation.

PDFs aren't an also, they're an only, and there's processing time involved, so we'll have to work out how that should operate for the user. Shefali's approach is evidently to email the user when the PDF is completed - not clear if she's attaching the PDF or providing a download link.

Viqsi commented 1 year ago

Some additional bits from my work diary, taken on the fly during the meeting:

2023-04-20 13:56:38-0400
It's a live web service! https://www.nimhgenetics.org/pedigree/

Shefali creates a CSV file on the fly and sends it to that service.

2023-04-20 14:01:35-0400
CORS might be an issue. And it takes some time for the file to emerge; one has to poll for it.
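
A minimal sketch of the polling part, assuming nothing about the service's actual API: `fetch` is a hypothetical callable that the caller supplies, returning the PDF bytes once the file exists and `None` while it is still being generated. The interval and timeout defaults are placeholders, not values from the service.

```python
import time
from typing import Callable, Optional


def poll_until_ready(
    fetch: Callable[[], Optional[bytes]],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call fetch() repeatedly until it returns data or the timeout passes.

    fetch is a hypothetical stand-in for "ask the service whether the PDF
    is ready"; it must return None while the file is not there yet.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("pedigree PDF was not ready before the timeout")
        sleep(interval_s)
```

Passing `sleep` in as a parameter keeps the loop testable without real waiting. In a browser front end the same loop would have to run server-side if CORS does turn out to block direct calls.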

2023-04-20 14:02:19-0400
PDFs only. Madeline provides SVGs, and they stitch them into PDFs with additional context and guides for the user.

Viqsi commented 1 year ago

@WValenti provided an example of how to pull the proper info in #135:

It works now, and is pushed. Using the example given above, you can extract and draw pedigrees using:

  1. mysql (ada.mathmed.org DIVER)>CALL API_AscertainCohort('testuser', 16, 'testPCa', 'Test of PC', 644, 'i10920', '1', 2, '>=', 4, @oncid); Select @oncid;
  2. mysql --defaults-extra-file=~/DATABASE-TO-USE.cnf DIVER -AB --execute="Select fam_id, ind_id, IFNULL(father_id,0) father_id, IFNULL(mother_id,0) mother_id, sex, ELT(IFNULL(value,2)+1,'009900','990000','999999') value from pedigreeCohortInds where cohortId = 674;" >test.pre
  3. cat test.conf
       PedigreeFile test.pre
       PedigreeName Cohort_Pedigrees
       Delimiter ws
       SubgraphVariable fam_id
       NameVariable ind_id
       FatherVariable father_id
       MotherVariable mother_id
       GenderVariable sex
       ColorVariable value
       TextVariable ind_id
       PageOrientation landscape
  4. cranefoot test.conf
  5. ps2pdf Cohort_Pedigrees.ps Cohort_Pedigrees.pdf
  6. open Cohort_Pedigrees.pdf
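
If the Cranefoot configuration from step 3 ever needs to be generated rather than hand-written, a sketch like the following would do it. The keys and values are copied from the steps above; the function name and the choice of a tab as the key/value separator are assumptions, so check the separator against the Cranefoot manual before relying on it.

```python
from typing import Dict


def render_cranefoot_config(settings: Dict[str, str], sep: str = "\t") -> str:
    """Render one 'Key<sep>value' line per setting, in insertion order.

    The separator is an assumption; adjust it if Cranefoot expects
    something other than a tab between keyword and value.
    """
    return "".join(f"{key}{sep}{value}\n" for key, value in settings.items())


# Settings taken from the test.conf shown in step 3.
COHORT_SETTINGS = {
    "PedigreeFile": "test.pre",
    "PedigreeName": "Cohort_Pedigrees",
    "Delimiter": "ws",
    "SubgraphVariable": "fam_id",
    "NameVariable": "ind_id",
    "FatherVariable": "father_id",
    "MotherVariable": "mother_id",
    "GenderVariable": "sex",
    "ColorVariable": "value",
    "TextVariable": "ind_id",
    "PageOrientation": "landscape",
}

config_text = render_cranefoot_config(COHORT_SETTINGS)
```

Writing `config_text` to `test.conf` would then replace the manual editing step.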

This is using Cranefoot, though, and the live service we were looking at using wraps Madeline.

WValenti commented 1 year ago

So this process works to generate (by hand for now) a file that can go to the NRGR service (madeline2 format):

  1. mysql --defaults-extra-file=~/DATABASE-TO-USE.cnf DIVER -NAB --execute="Select fam_id, ind_id, IFNULL(father_id,0) father_id, IFNULL(mother_id,0) mother_id, FIELD(sex,'M','F') sex, IFNULL(value,-1)+1 value from pedigreeCohortInds where cohortId = 674;" >test.pre
  2. cat test.locus
       A trait
  3. prepmad test.pre test.locus test # prepmad is a link to prepcrane in mmpipescript

...and you've got a test.mad file to upload.

To see local results from madeline2, just: madeline2 -L "IndividualId" test.mad

The test.mad file looks like:

  cat test.mad
  FamilyId IndividualId Father   Mother   Gender Affected trait
  83-03070 70-00250     70-00253 70-00252 F      a        2
  83-03070 70-00251     70-00253 70-00252 F      a        2
  83-03070 70-00252     .        .        F      .        0
  83-03070 70-00253     .        .        M      .        0
  ...

The rest of the parameters that the service wants need to be teased out, but you'll at least get something to work with using this file.
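
For illustration only, here is a sketch of the row mapping implied by the query output and the test.mad sample above. It is not prepmad and makes no claim about what prepmad does internally. Function names are hypothetical, and the `"u"` code for a value of 1 is a guess: the sample only shows `2` becoming `a` and `0` becoming `.`.

```python
from typing import Iterable, List

MAD_HEADER = ["FamilyId", "IndividualId", "Father", "Mother",
              "Gender", "Affected", "trait"]

# FIELD(sex,'M','F') in the query yields 1 for M and 2 for F.
SEX_CODES = {"1": "M", "2": "F"}

# 0 -> "." and 2 -> "a" match the sample; 1 -> "u" is an assumption.
AFFECTED_CODES = {"0": ".", "1": "u", "2": "a"}


def pre_row_to_mad(row: str) -> List[str]:
    """Map one whitespace-delimited row of the query output to .mad fields."""
    fam, ind, father, mother, sex, value = row.split()
    return [
        fam,
        ind,
        "." if father == "0" else father,  # 0 marks a missing parent
        "." if mother == "0" else mother,
        SEX_CODES.get(sex, "."),
        AFFECTED_CODES.get(value, "."),
        value,
    ]


def pre_to_mad(rows: Iterable[str]) -> str:
    """Build the full .mad text: header line, then one line per input row."""
    lines = [" ".join(MAD_HEADER)]
    lines += [" ".join(pre_row_to_mad(r)) for r in rows if r.strip()]
    return "\n".join(lines) + "\n"
```

Feeding it the rows behind the sample, for example `"83-03070 70-00252 0 0 2 0"`, reproduces the corresponding sample line apart from column alignment.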

WValenti commented 1 year ago

Once we finalize a format, we can explicitly write the query to produce most, if not all, of what we want in the .mad file.

Forgot to mention that at the moment, the query is designed to produce a LINKAGE file because that's the input that prepcrane and prepmad expect. The locus file will also be unneeded when we write the .mad file directly, because it, too, is a LINKAGE thing needed by prep*.