Closed by twang15 2 years ago
Hi Annika,
Sorry, I cannot remember anything about this submission. However, it is tracked on GitHub, and I will start the submission:
https://github.com/StanfordBioinformatics/pulsar_lims/issues/811
Best, Tao
Hi Ingrid and Annika,
This ChIP experiment https://www.encodeproject.org/experiments/ENCSR236JZN/ was released on August 3rd, 2020. I was able to add 2 more replicates (rep 3 and rep 4) but failed to add the fastq files.
My question is:
Annika, in this submission sheet https://docs.google.com/spreadsheets/d/1Hj5Al2J_F0sUUolG-dUQZqpEFPEl-YUm/edit#gid=795058038, you asked in row 5 for rep3/rep4: “why was this experiment repeated? It's released with minor audit?” Did you figure out the answer, or do we need to submit rep 3 and rep 4?
Best, Tao
Missing biosample characterization:
IP posted for both biosamples
Hi Annika,
We have new sequencing data from SREQ-430 for cs-152 (ENCSR774PVT). But its upstream experiment has been removed. Do we still want to submit the new data?
Thanks, Tao
Hi Ingrid and Jennifer,
Is the ENCODE server under maintenance? I have been trying to submit a ChIP experiment: https://www.encodeproject.org/experiments/ENCSR548FLW/, but constantly see this error:
2022-02-28 09:53:54,440:ppy_debug: GET Biosample record with ID 13848: https://pulsar-encode.herokuapp.com/api/biosamples/13848 ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Thanks, Tao
Hi Tao,
I haven't been able to find a problem on our end, and we aren't seeing issues with submission from other users or on the team. Have you given the submission another try this afternoon?
Alternatively, is there possibly a mismatch between the submission metadata and the object type, or some other clash? I see that a released biosample has an alias matching the Pulsar record ID in your error: https://www.encodeproject.org/biosamples/ENCBS447IFV/
Best, Ingrid
Hi Ingrid,
Here is the latest error message. The portal detects a conflict caused by a duplicate file submission, but then it says the file cannot be found.
2022-03-01 12:43:28,942:eu_debug: <<<<<< POST file record michael-snyder:SREQ-430-GCCAAT_S3_L001_R1_001.fastq.gz To DCC with URL https://www.encodeproject.org/file and this payload:
{ "aliases": [ "michael-snyder:SREQ-430-GCCAAT_S3_L001_R1_001.fastq.gz" ], "award": "/awards/UM1HG009442/", "controlled_by": [ "ENCFF931PCB" ], "dataset": "ENCSR548FLW", "file_format": "fastq", "file_size": 2803379992, "flowcell_details": [ { "barcode": "GCCAAT", "lane": "" } ], "lab": "michael-snyder", "md5sum": "884b33332f913ffa08d374e8715c4e95", "output_type": "reads", "paired_end": "1", "platform": "encode:NovaSeq6000", "read_length": 100, "replicate": "1fd30eca-0e2d-4e85-a7eb-a87727b35acb", "run_type": "paired-ended", "submitted_file_name": "/oak/stanford/scg/prj_ENCODE/SREQ/430/SREQ-430-GCCAAT_S3_L001_R1_001.fastq.gz" }
2022-03-01 12:43:29,475:eu_debug: {'@type': ['HTTPConflict', 'Error'], 'status': 'error', 'code': 409, 'title': 'Conflict', 'description': 'There was a conflict when trying to complete your request.', 'detail': "Keys conflict: [('alias', 'md5:884b33332f913ffa08d374e8715c4e95')]"}
2022-03-01 12:43:29,476:eu_debug: >>>>>>GET michael-snyder:SREQ-430-GCCAAT_S3_L001_R1_001.fastq.gz From DCC with URL https://www.encodeproject.org/michael-snyder:SREQ-430-GCCAAT_S3_L001_R1_001.fastq.gz/?format=json&datastore=database
2022-03-01 12:43:29,624:eu_debug: NOT FOUND
Hi Tao,
It looks like the clash is via the md5sum (so it sees that identical file contents are being uploaded, and blocks that action), but the GET to try to see that file is using an alias that doesn't exist (the file on the portal has a very slightly different order of elements in its alias).
File on the portal with md5:884b33332f913ffa08d374e8715c4e95 has alias michael-snyder:SREQ-430-3-GCCAAT_S3_L001_R1_001.fastq.gz. The file you're trying to upload has the matching md5 but different alias, michael-snyder:SREQ-430-GCCAAT_S3_L001_R1_001.fastq.gz, which doesn't exist and that's why it can't be found.
Best, Ingrid
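The diagnosis above (a 409 keyed on the md5, followed by a GET on an alias that was never registered) can be illustrated with a small sketch. It only parses the error body shown in the log and picks the identifier to look up; the function names `conflicting_keys` and `lookup_identifier` are hypothetical and not part of any ENCODE or Pulsar library, and whether the portal resolves an `md5:...` identifier as a lookup path is an assumption to verify before relying on it.

```python
import ast

CONFLICT_PREFIX = "Keys conflict: "


def conflicting_keys(error_body):
    """Return the (key_type, value) pairs listed in a 409 body's 'detail' field.

    Assumes the 'detail' string has the shape seen in the log above:
    "Keys conflict: [('alias', 'md5:...')]".
    """
    detail = error_body.get("detail", "")
    if error_body.get("code") != 409 or not detail.startswith(CONFLICT_PREFIX):
        return []
    # The remainder is a Python-literal list of tuples, so literal_eval is safe here.
    return ast.literal_eval(detail[len(CONFLICT_PREFIX):])


def lookup_identifier(error_body, submitted_alias):
    """Prefer the conflicting md5 key over the submitted alias for the follow-up lookup.

    Hypothetical helper: the submitted alias may not exist on the portal,
    while the md5 key is what actually triggered the conflict.
    """
    for key_type, value in conflicting_keys(error_body):
        if key_type == "alias" and value.startswith("md5:"):
            return value
    return submitted_alias


# Error body copied from the log above (other fields omitted for brevity).
example_error = {
    "status": "error",
    "code": 409,
    "title": "Conflict",
    "detail": "Keys conflict: [('alias', 'md5:884b33332f913ffa08d374e8715c4e95')]",
}
submitted = "michael-snyder:SREQ-430-GCCAAT_S3_L001_R1_001.fastq.gz"
print(lookup_identifier(example_error, submitted))
```

Looking up the record by the conflicting md5 key rather than by the submitted alias would surface the existing file (here, the one aliased with `SREQ-430-3-...`) instead of ending in NOT FOUND.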
Hi Ingrid,
Thanks for identifying the root cause. Now I am confused: how did files from SREQ-430 already get submitted, given that SREQ-430 and SREQ-431 were swapped? We identified this mistake after winter break, and the conflicting file is from what was then labeled SREQ-430 (before winter break), which should actually be SREQ-431.
Hi Annika,
After a long and careful sift through the records, I have finally finished most of the ChIP submissions in this sheet: https://docs.google.com/spreadsheets/d/1Hj5Al2J_F0sUUolG-dUQZqpEFPEl-YUm/edit#gid=795058038
Right now, we are waiting for Ingrid’s notification that the fastq files are complete. Some need to be moved from revoked experiments (marked red); others are affected by the swap of SREQ-430 and SREQ-431 caused by the Sequencing Center.
After that, IPs, PCRs and possible_controls need to be submitted manually. It has been a long road, but we are close to seeing the light.
Best, Tao
Hi Ingrid,
Please proceed to move rep 3 for the following two experiments:
from https://www.encodeproject.org/ENCSR101DNY to https://www.encodeproject.org/experiments/ENCSR279OXE/
from https://www.encodeproject.org/ENCSR727GPE to https://www.encodeproject.org/ENCSR724QJT
Thanks, Tao
cs-538 is completed https://www.encodeproject.org/experiments/ENCSR279OXE/
cs-537 is completed https://www.encodeproject.org/experiments/ENCSR724QJT/
All ChIPs in this batch have been submitted successfully.
https://docs.google.com/spreadsheets/d/1Hj5Al2J_F0sUUolG-dUQZqpEFPEl-YUm/edit#gid=795058038
Hi Tao,
Could you also check this spreadsheet? There are ChIPs that can be submitted. I think we talked about those when we last met.
Thanks, Annika