Open a-khalak opened 5 years ago
@cschin
Ideally, the hello world case would involve
The idea is that the user should be pretty far along in terms of being able to self-serve in running their own data by more or less copying the hello world example.
Just put the reads in a public location on S3.
Can you see the E. coli K-12 FASTA files at the following location (supposedly public)?
asif@dockerbox:~ $ aws s3 ls s3://biologicaldatascience.org/data/ecoli-k12/
2019-06-02 23:22:38          0
2019-06-02 23:23:21    4706062 K12MG1655.fa
This only works if you have an AWS account and the AWS CLI tools installed. The following link should work even without anything installed.
https://s3.amazonaws.com//biologicaldatascience.org/data/ecoli-k12/K12MG1655.fa
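For anyone scripting the download rather than clicking the link, here is a minimal sketch using only the Python standard library. The helper names `public_s3_url` and `download` are hypothetical, and the sketch assumes the object is publicly readable through the path-style `s3.amazonaws.com` endpoint with a single slash before the bucket name, which I have not verified for this bucket.

```python
import shutil
import urllib.parse
import urllib.request


def public_s3_url(bucket, key):
    """Build a path-style HTTPS URL for a public S3 object (assumed layout)."""
    # quote() leaves "/" unescaped by default, so the key path is preserved.
    return "https://s3.amazonaws.com/{}/{}".format(bucket, urllib.parse.quote(key))


def download(url, dest):
    """Stream the object at `url` into the local file `dest`."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

Calling `download(public_s3_url("biologicaldatascience.org", "data/ecoli-k12/K12MG1655.fa"), "K12MG1655.fa")` should then fetch the file without an AWS account, provided the bucket policy really allows anonymous reads.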
@sifta I agree that it would be good to have a demonstration script. It will be tricky to package that within the Docker image right now. Let me think about it. One challenge is that users' data can be quite arbitrary, so the script may not work every time.
@cschin
Could you show some downstream best practices?
For example, polishing with Illumina reads or connecting contigs into scaffolds?
@huangl07 I will move your question to a new issue, since it is off topic for this issue.
For users who are primarily interested in running Peregrine and mainly tweaking hyperparameters, it would be helpful to have a 'hello world' runbook that walks through setting up the required inputs and running a test case on sample data (e.g. E. coli K-12).
Ideally, this would involve confirming system dependencies, obtaining some canned reads (.fasta and .lst files), running Peregrine from the latest stable Docker Hub build, and confirming the resulting assembly.
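As a rough illustration of what the non-Peregrine parts of such a runbook could look like, here is a sketch in Python. All function names are hypothetical. It assumes the .lst file is simply a list of read file paths, one per line, and that "confirming the resulting assembly" means checking contig count, total length, and N50 of the output FASTA. The actual Peregrine invocation is deliberately left out, since its exact command line is not given in this thread.

```python
import shutil
from pathlib import Path


def missing_tools(tools):
    """Return the command names from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def write_lst(fasta_paths, lst_path):
    """Write one read-file path per line (assumed .lst layout)."""
    lines = [str(Path(p)) for p in fasta_paths]
    Path(lst_path).write_text("\n".join(lines) + "\n")
    return len(lines)


def contig_lengths(fasta_path):
    """Return the sequence length of each record in a FASTA file."""
    lengths = []
    current = None  # None until the first header line is seen
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif line and current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that contigs of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0
```

A runbook could call `missing_tools(["docker"])` first and stop if the list is non-empty, write the .lst file with `write_lst`, run the assembler, and then compare `sum(contig_lengths(...))` and `n50(...)` for the output against the size of the K12MG1655.fa reference.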