Closed · @ssarrafan closed this issue 3 years ago

This issue is to document and track testing feedback and fixes in one place.
Testing feedback from today is noted in the testing worksheet and there is a Google doc with screenshots and other details.
Thanks, @ssarrafan, I'm working on these issues now, except for two:
Two Gold IDs for all samples in Brodie study (already reported on Slack)
This is a metadata problem, so I think it needs to be corrected in Mongo. Is that right?
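If it helps, here's a rough sketch of how the affected records could be surfaced in Mongo. The connection details, database name, collection name (`biosample_set`), and slot name (`gold_biosample_identifiers`) are my assumptions from reading the NMDC schema, not confirmed in this thread:

```python
# Hedged sketch: list Biosample records carrying more than one GOLD ID.
# Collection, database, and slot names below are assumptions.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder connection
db = client["nmdc"]  # placeholder database name

cursor = db["biosample_set"].find(
    # match documents whose GOLD-identifier list has more than one entry
    {"$expr": {"$gt": [{"$size": {"$ifNull": ["$gold_biosample_identifiers", []]}}, 1]}},
    {"id": 1, "gold_biosample_identifiers": 1},
)
for doc in cursor:
    print(doc["id"], doc["gold_biosample_identifiers"])
```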
Can you please add a PI image to the study page for the new SPRUCE study? The photo is on page 3 of the Google doc.
The profile image is part of the NMDC schema (PersonValue.profile_image_url); our server doesn't host images. The image URL will need to be added to the principal_investigator object of the study.
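For reference, a minimal sketch of what that object could look like once the URL is added. Only profile_image_url is confirmed by the schema reference above; the other slot values and the URL are placeholders, not the actual stored values:

```python
# Minimal sketch of a PersonValue-shaped principal_investigator object
# with profile_image_url filled in. All values shown are placeholders.
principal_investigator = {
    "has_raw_value": "Christopher W. Schadt",
    "name": "Christopher W. Schadt",
    "profile_image_url": "https://example.org/profile_images/christopher_schadt.jpg",
}
```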
@dwinston or @dehays can one of you please add Dr. Schadt's photo from the third page of this Google doc to NERSC or wherever the PI images need to live?
I created a Chris Schadt image file and added it to `cori.nersc.gov:/global/cfs/cdirs/m3408/www/profile_images`.
I'll use a changesheet to update the PI image URL value on the SPRUCE study.
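Roughly, a changesheet is a TSV with id/action/attribute/value columns. A hedged sketch of one for this change follows; the study ID, the URL, and the variable-row (v1) convention for setting a nested attribute are assumptions here, and the exact grammar is defined in nmdc-runtime:

```python
# Hedged sketch: write a changesheet that points a study's
# principal_investigator.profile_image_url at a new image.
# Study id, URL, and the v1 variable-row pattern are assumptions.
import csv

rows = [
    # Bind the study's principal_investigator slot to a working variable...
    {"id": "gold:Gs0XXXXXXX", "action": "update",
     "attribute": "principal_investigator", "value": "v1"},
    # ...then set profile_image_url on that object via the variable row.
    {"id": "v1", "action": "update", "attribute": "profile_image_url",
     "value": "https://example.org/profile_images/christopher_schadt.jpg"},
]

with open("update_spruce_pi_image.tsv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "action", "attribute", "value"],
                            delimiter="\t")
    writer.writeheader()
    writer.writerows(rows)
```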
Mongo has been updated with the Chris Schadt profile image URL as PI of the SPRUCE study. This should be visible after the next portal ingest.
Thank you @dehays
The rest of the items in the list should have been addressed. Please LMK if any other changes are needed.
@dwinston please see GH issue https://github.com/microbiomedata/nmdc-runtime/issues/40
Not sure if this is an artifact of other issues, but the links out to the NCBI/EBI biosample pages don't look correct for at least a few samples I checked. For example, for _Riverbed sediment microbial communities from areas with no vegetation in Columbia River, Washington, USA - GW-RW N1_1020_, the Biosample accession should be SAMN06267121, but the link currently points to a fish sample. The IMG and GOLD links seem OK.
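For anyone checking the outlinks: NCBI BioSample pages follow a stable URL pattern, so the portal link for this sample should resolve as below rather than to an unrelated record. The helper function is just an illustration:

```python
# Illustrative helper: build the NCBI BioSample page URL for an accession.
def ncbi_biosample_url(accession: str) -> str:
    return f"https://www.ncbi.nlm.nih.gov/biosample/{accession}"

print(ncbi_biosample_url("SAMN06267121"))
# https://www.ncbi.nlm.nih.gov/biosample/SAMN06267121
```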
5 Brodie samples have a different ENVO classification compared to the rest of Brodie's samples.
The two ENVO classifications were shown in screenshots (not reproduced here). The classification of the other samples I can verify in GOLD; these 5 samples don't exist in GOLD and must be EMSL-only samples. From the latest spreadsheet, all Brodie samples have the same GOLD classification, so I would presume the ENVO terms should be identical too... yet it's unclear to me which ENVO classification is correct, or whether these 5 samples should just be removed altogether (I thought I remembered we weren't showing EMSL-only samples, but I could be wrong).
meta: I can usually tell when something needs my attention, but please tag me by name if there's something specifically for me to do.
@subdavis Here's the document with the screenshots and issues with downloading that Karen reported. I tried the first file she tried (1781_100351.filtered.fastq.gz.download) and it's still trying to download and appears to be stuck: 20211012_Issues_with_downloading_from_portal_kd.docx
I tried the same thing in the production instance and it's much faster.
@subdavis here's a screenshot that shows the difference in how fast it's downloading... the top file is from production and started later and the third one down is from dev and started sooner.
Thank you for the detailed report. I can reproduce this now.
```
2021/10/13 01:28:59 [error] 11#11: *113 upstream prematurely closed connection while reading upstream, client: 128.55.212.127, server: localhost, request: "GET /data/1781_100351/qa/1781_100351.filtered.fastq.gz HTTP/1.1", upstream: "http://10.42.8.144:8080/1781_100351/qa/1781_100351.filtered.fastq.gz", host: "data.microbiomedata.org", referrer: "https://data.dev.microbiomedata.org/"
```
From the data container, I see a 206 return code.
```
128.55.206.110 - - [13/Oct/2021:01:30:58 +0000] "GET /1781_100351/qa/1781_100351.filtered.fastq.gz HTTP/1.0" 206 871207475 "https://data.microbiomedata.org/?q=ChQIABACGAIiDCJNZXRhZ2Vub21lIg==" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.58 Safari/537.36" "216.63.191.253"
```
I can see that each request fails (for me) after exactly 1 GB has been downloaded. If the download is resumed, it runs until exactly 2 GB is downloaded.
I suspect a proxy configuration issue; I'm going to recruit some help from another member of the team tomorrow.
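To make the symptom reproducible outside a browser, here's a rough probe script (not the fix): it streams the failing file, counts bytes until the connection drops, then resumes with a Range header, which is why the access log above shows 206 responses. As a hunch, the exactly-1-GB cutoff is consistent with nginx's default proxy_max_temp_file_size of 1024m, though nothing in this thread confirms which setting was at fault:

```python
# Diagnostic sketch: stream the failing file, count bytes until the
# connection drops, and resume from the last offset with a Range header.
# URL matches the failing request in the nginx log above.
import requests

URL = "https://data.microbiomedata.org/data/1781_100351/qa/1781_100351.filtered.fastq.gz"

def probe(start: int = 0) -> int:
    """Stream from byte offset `start`; return how many new bytes arrived."""
    headers = {"Range": f"bytes={start}-"} if start else {}
    received = 0
    try:
        with requests.get(URL, headers=headers, stream=True, timeout=60) as r:
            r.raise_for_status()  # expect 206 Partial Content on resumed requests
            for chunk in r.iter_content(chunk_size=1 << 20):
                received += len(chunk)
    except (requests.exceptions.ChunkedEncodingError,
            requests.exceptions.ConnectionError):
        pass  # connection closed mid-body, which is the symptom being probed
    return received

offset = 0
for attempt in range(3):
    got = probe(offset)
    offset += got
    print(f"attempt {attempt + 1}: +{got} bytes (total {offset})")
```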
I just tried the same download from Karen's first example in production, and it worked fine and downloaded quickly, so the good news is there are no problems with downloads in production. The issue is only on dev.
If you tried it within the last 10 minutes, that's because I was testing a fix and it appears to be working :)
Brandon is now helping me get it deployed in a permanent fashion; it should be done soon.
Closing this as the October release is in prod now.