kevinkle closed 6 years ago
For future reference, in case we ever need it:
ubcsamsung [5:32 PM]
Hi Kevin
kevin [5:33 PM]
Howdy
ubcsamsung [5:33 PM]
Could you tell me how you get the filename when you download the genome data from enterobase?
[5:33]
There seems to be some mismatch
kevin
[5:34 PM]
Hmm
[5:34]
so we take the `barcode` value under the `experiment` dictionary as the filename
[5:35]
(after appending `.fasta`)
[5:35]
There are checks for files which aren’t assembled or are not found on enterobase
[5:35]
Namely lines `52` and `9`
[5:36]
can you elaborate on the mismatch?
ubcsamsung [5:37 PM]
We are trying to match the serotype data from enterobase to the genome we have
[5:37]
For example
[5:38]
Hmm give me a sec
kevin
[5:38 PM]
If I'm not mistaken
[5:39]
this would be related to the difference between the `barcode` name in the `experiment` dicts vs the `strains` dict
ubcsamsung [5:39 PM]
yes
kevin
[5:39 PM]
for example, this might be `ESC_AA7740AA` under `strains` and `ESC_CA1647AA_AS` under `experiment`
ubcsamsung [5:40 PM]
there seems to be some difference between assembly barcode and just barcode
kevin
[5:41 PM]
try backtracing the `barcode` in `experiment` to its `id`
[5:42]
this should give you a match to the row in `strains`
ubcsamsung [5:42 PM]
where is the experiment file?
kevin
[5:42 PM]
i.e., barcodes `ESC_AA7740AA` and `ESC_CA1647AA_AS` both use id `7740`
[5:43]
you can start up a Python interpreter (or use a script, if you’d like) and run
```
r = requests.post('http://enterobase.warwick.ac.uk/get_data_for_experiment', data=options)
d = r.json()
# d.keys()
# [u'strains', u'experiment']
strains = d['strains']
experiment = d['experiment']
```
[5:43]
after running `import requests`, ofc
ubcsamsung [5:45 PM]
options is?
kevin
[5:45 PM]
ah right sorry
[5:45]
this is from https://github.com/superphy/backend/blob/master/scripts/enterobase.py
[5:45]
where
```
options = {
    'no_legacy': 'true',
    'experiment': 'assembly_stats',
    'database': 'ecoli',
    'strain_query_type': 'query',
    'strain_query': 'all'
}
```
[5:46]
this just mimics the behavior of the `GET` request
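For reference, here is a minimal sketch of the backtrace described in the chat: look up each `experiment` record's `id` in `strains`, and key the result by the FASTA filename (the assembly barcode plus `.fasta`). The shape of the data is an assumption, not something confirmed above: the sketch treats both `strains` and `experiment` as lists of dicts that each carry an `id` and a `barcode` key. The helper names `index_by_id` and `match_assemblies_to_strains` are hypothetical and are not taken from `enterobase.py`.

```python
# Assumption: `strains` and `experiment` are lists of dicts, and each dict
# has an `id` and a `barcode` key. The real EnteroBase response may differ.

def index_by_id(records):
    """Map each record's id to the record itself."""
    return {rec['id']: rec for rec in records}


def match_assemblies_to_strains(experiment, strains):
    """Pair every experiment record with the strains row sharing its id.

    Returns a dict keyed by FASTA filename (assembly barcode + '.fasta').
    Experiment records with no matching strains row are skipped.
    """
    strains_by_id = index_by_id(strains)
    matches = {}
    for rec in experiment:
        strain = strains_by_id.get(rec['id'])
        if strain is None:
            continue  # no strains row for this assembly
        matches[rec['barcode'] + '.fasta'] = strain
    return matches


# Toy data built from the barcodes mentioned in the chat.
strains = [{'id': 7740, 'barcode': 'ESC_AA7740AA'}]
experiment = [{'id': 7740, 'barcode': 'ESC_CA1647AA_AS'}]

matches = match_assemblies_to_strains(experiment, strains)
print(matches['ESC_CA1647AA_AS.fasta']['barcode'])  # ESC_AA7740AA
```

With the real response, the same call would take the `strains` and `experiment` values pulled out of `d`, provided they match the assumed shape. Serotype or other metadata stored on the `strains` row can then be read off the matched record for each downloaded file.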
Tests are passing as of https://github.com/superphy/backend/commit/e6aa5b75dd322b12bd338bffb55592ec23fb239a
Example of the metadata file expected is provided in https://github.com/superphy/backend/blob/218-metadata/app/tests/example_metadata.xlsx
Will test via reactapp now.
Merged, closing issue.
This can be seen as a follow-up to https://github.com/superphy/backend/issues/210