superphy / spfy

Spfy: an integrated graph database for real-time prediction of Escherichia coli phenotypes and downstream comparative analyses
Apache License 2.0

error when headers are `>NODE_2_length_339420_cov_157.209` #225

Closed kevinkle closed 6 years ago

kevinkle commented 6 years ago

is incorrect, this is an error with spfy in some way. Running the command

`(backend) ubuntu@host-10-1-5-81:/opt/backend/app$ python -m modules/qc/qc -i tests/headers/ESC_AA7855AA_AS-error.fasta`

returns True, so graphing and SeqIO should be working OK for this header name.

Adding a `sample1|` prefix, so that the header becomes `>sample1|NODE_2_length_339420_cov_157.209`, should fix this.
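The header workaround described above can be sketched in a few lines of Python. This is only an illustration, not part of spfy: the function name `prefix_fasta_headers` and the choice of `sample1` as the ID are assumptions made for the example.

```python
def prefix_fasta_headers(lines, sample_id):
    """Yield FASTA lines with '<sample_id>|' prepended to every header line.

    Hypothetical helper for illustration; spfy itself does not ship this.
    Headers that already carry the prefix are left unchanged.
    """
    prefix = sample_id + "|"
    for line in lines:
        if line.startswith(">") and not line[1:].startswith(prefix):
            # '>NODE_2_...' becomes '>sample1|NODE_2_...'
            yield ">" + prefix + line[1:]
        else:
            # Sequence lines (and already-prefixed headers) pass through.
            yield line


if __name__ == "__main__":
    example = [">NODE_2_length_339420_cov_157.209\n", "ACGT\n"]
    print("".join(prefix_fasta_headers(example, "sample1")), end="")
```

To rewrite a file, the same generator could be fed an open file handle and its output written to a new file, so the original upload stays untouched.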

chadlaing [1:32 PM] 
hey @kevin, I received two test files from our Collaborator and tried one of them on `spfy`. Initially it said QC passed, but when I checked later the job had failed and it said QC failed. Id is `Job failed. Key: 81de98df-938b-4628-b0c3-5fb6559f4f15 /`

kevin [1:58 PM]

Looks like this is linked to our blazegraph issue

chadlaing [2:00 PM] 
ok, thanks

is that waiting for eg. the AMR to be computed, or simply a query of the database

kevin [2:03 PM] 
It's trying to lookup the spfyid for an uploaded file and the db isn't responding

[2:07 PM] 

three minutes with no response? very strange

[2:16 PM] 
do you have any idea why it would consistently fail with one file but not another?

chadlaing [2:24 PM] 
anecdotally, if I take the failed file that has headers like so: `>NODE_2_length_339420_cov_157.209` and edit them so they all share a common name like `>sample1|NODE_2_length_339420_cov_157.209` then everything works as expected, and quickly to boot

[2:25 PM] 

That would suggest another cause of this error

I'll try adding some parsing checks to the qc module tonight and run the same setup

chadlaing [2:29 PM] 
ok -- in the meantime I can suggest that she add an ID to her inputs

kevinkle commented 6 years ago

I wonder if this is related to the command results = sparql.query().convert()
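If the blocking call is the suspect, one way to keep a single unresponsive query from using up the whole job timeout is to wrap it in a bounded retry. The following is a minimal sketch under assumptions: `call_with_retries` and `query_fn` are hypothetical names, and `query_fn` is assumed to wrap the `sparql.query().convert()` call and to raise an exception (for example from a client-side timeout) rather than hang.

```python
import time


def call_with_retries(query_fn, attempts=3, delay_seconds=1.0):
    """Call query_fn() and retry on failure, re-raising the last error.

    Hypothetical helper: query_fn is assumed to raise when the database
    does not answer in time; this wrapper does not enforce a timeout itself.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return query_fn()
        except Exception as error:  # broad on purpose in this sketch
            last_error = error
            if attempt < attempts:
                # Brief pause before asking the database again.
                time.sleep(delay_seconds)
    raise last_error
```

This only helps when each attempt fails fast; the per-attempt timeout multiplied by the number of attempts would need to stay below the job's own limit.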

kevinkle commented 6 years ago

Unable to replicate this issue locally using files in

kevinkle commented 6 years ago

Read the comment wrong...

kevinkle commented 6 years ago

on corefacility:

ESC_AA7855AA_AS-works.fasta with pi: 90 for Serotype VF
Submitted: 12:04:36 PM, Status: FAILED
ERROR WITH JOB: blob345709520386712084
Traceback (most recent call last):
  File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/", line 700, in perform_job
    rv = job.perform()
  File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/", line 500, in perform
    self._result = self.func(*self.args, **self.kwargs)
  File "./modules/blazeUploader/", line 138, in write_reserve_id
    spfyid = reserve_id(query_file)
  File "./modules/blazeUploader/", line 121, in reserve_id
    largest = check_largest_spfyid()
  File "./modules/blazeUploader/", line 62, in check_largest_spfyid
    results = sparql.query().convert()
  File "/opt/conda/envs/backend/lib/python2.7/site-packages/SPARQLWrapper/", line 567, in query
    return QueryResult(self._query())
  File "/opt/conda/envs/backend/lib/python2.7/site-packages/SPARQLWrapper/", line 537, in _query
    response = urlopener(request)
  File "/opt/conda/envs/backend/lib/python2.7/", line 154, in urlopen
    return, data, timeout)
  File "/opt/conda/envs/backend/lib/python2.7/", line 429, in open
    response = self._open(req, data)
  File "/opt/conda/envs/backend/lib/python2.7/", line 447, in _open
    '_open', req)
  File "/opt/conda/envs/backend/lib/python2.7/", line 407, in _call_chain
    result = func(*args)
  File "/opt/conda/envs/backend/lib/python2.7/", line 1228, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/opt/conda/envs/backend/lib/python2.7/", line 1201, in do_open
    r = h.getresponse(buffering=True)
  File "/opt/conda/envs/backend/lib/python2.7/site-packages/raven/", line 346, in getresponse
    rv = real_getresponse(self, *args, **kwargs)
  File "/opt/conda/envs/backend/lib/python2.7/", line 1121, in getresponse
    response.begin()
  File "/opt/conda/envs/backend/lib/python2.7/", line 438, in begin
    version, status, reason = self._read_status()
  File "/opt/conda/envs/backend/lib/python2.7/", line 394, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/opt/conda/envs/backend/lib/python2.7/", line 480, in readline
    data = self._sock.recv(self._rbufsize)
  File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/", line 51, in handle_death_penalty
    'value ({0} seconds)'.format(self._timeout))
JobTimeoutException: Job exceeded maximum timeout value (180 seconds)
ESC_AA7855AA_AS-error.fasta with pi: 90 for Serotype VF
Submitted: 12:04:50 PM, Status: COMPLETE
JobId: blob2384235990117585805



ESC_AA7855AA_AS-works.fasta with pi: 90 for Serotype VF
Submitted: 12:05:09 PM, Status: COMPLETE
JobId: blob4062748484315444847

ESC_AA7855AA_AS-error.fasta with pi: 90 for Serotype VF
Submitted: 12:05:16 PM, Status: COMPLETE
JobId: blob6882864231611697523

kevinkle commented 6 years ago

Able to replicate on corefacility. (works) (doesn't work)

However, both files work locally.

kevinkle commented 6 years ago

corefacility is running on f4befb9

kevinkle commented 6 years ago

Can't see anything that would be diff. from master

kevinkle commented 6 years ago

On Cybera, both files work as well when running master. Need to test if issue is related to or corefacility

kevinkle commented 6 years ago

Unable to replicate error with on Cybera.

kevinkle commented 6 years ago

Looks like this wasn't related to Spfy. We have moved corefacility's backend temporarily to Cybera, which bypasses the blazegraph issue. Will wait to see if this fixes that issue. Closing issue.