FDA-ARGOS / data.argosdb

MIT License

Christie- HIVE3 HIVEIDs in computation do not work due to downloader engine issue #435

cwoodside1278 closed 5 months ago

cwoodside1278 commented 6 months ago

This is what I sent to Vahan for context and the blocker:

I am continuing to troubleshoot HIVE3 and wanted to explain to you what I am seeing. If you have time tomorrow to discuss, that would be great.

I used Yersinia enterocolitica, SRR12934926 (which has 2 fastq files), SAMN16357259, and GCF_000009345.1.

  1. Ran the brand-new organism (new to the HIVE3 system for my account) and had zero computational issues. ID 1662 for the folder, 1661 for the workflow.
  2. Ran the same organism again, inputting all the raw IDs, and had no issues. IDs 1673 and 1674.
  3. Moved the SRR information to another folder outside of the original; it had no issue and ran to completion. The folder is named 'yersinia moved files', ID 1678. I moved the ERR folder into there and out of the directory of the first one.

I then ran the organism again, but using HIVEID SRRs (the two that were created), a manually inputted assembly ID, and a manually inputted Biosample, and it started to slow down, specifically in the downloader engine. I do not know why the downloader engine would be running when the ERRs were already selected AND stored in HIVE3. It should not have to look for them, correct?

I then noticed this: I grabbed the HIVEIDs, which were SRR12934926_1 and _2 for Yersinia enterocolitica, but when the downloader engine finished, it had grabbed and created a folder for SRR12934927_1, which is only one fastq file by itself. It did not even use the ERRs I selected; it went and searched for another SRR file...

It didn't even bother to use the two fastq files that I clicked on as HIVEIDs... It is also interesting that it only grabbed the one, because there are roughly 20 other SRR IDs for this organism. For this organism, some SRR IDs even have up to 30 fastqs assigned to them. So why did it grab the one with the single fastq file and not an SRR ID with many fastqs? Why did the downloader engine even run at all?

Additional bugs:

Thank you!

cwoodside1278 commented 5 months ago

The issue was dealt with. Vahan had parsed the data with a ; instead of a , and this messed up the list generation. The problem has been resolved.
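
For anyone hitting something similar, here is a minimal sketch of how a delimiter mismatch can break list generation. This is not the HIVE3 code: the function name `parse_id_list` and the comma-separated input string are assumptions made up for illustration. It only shows that splitting a comma-separated ID string on `;` returns a single fused item instead of the two expected IDs.

```python
def parse_id_list(raw: str, delimiter: str = ",") -> list[str]:
    """Split a delimited string of IDs, trimming whitespace and dropping empties.

    Hypothetical helper for illustration; not part of HIVE3.
    """
    return [item.strip() for item in raw.split(delimiter) if item.strip()]


# Assumed input shape: the two selected IDs joined with a comma.
raw_ids = "SRR12934926_1,SRR12934926_2"

# Splitting on the wrong delimiter leaves the whole string as one item.
wrong = parse_id_list(raw_ids, delimiter=";")

# Splitting on the delimiter actually used in the data gives one item per ID.
right = parse_id_list(raw_ids, delimiter=",")

print(len(wrong), wrong)
print(len(right), right)
```

A cheap guard against this kind of bug is to check that the number of parsed items matches the number of IDs that were selected before anything downstream runs.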