Closed chasemc closed 2 years ago
This PR allows creating mock contigs from an input set of GenBank assembly accessions. It also creates two minimal reports at the end showing the binning results: one colored by genus (parsed from the name) and one by assembly accession.
Example output: mock_data_reports.zip
Notes: Mock data reports should write out to the main output folder.
To run the pipeline with mock data, set the parameter `--mock_test true`.
A couple of things to fix (or not) before merging in (@WiscEvan I don't think I'll have time to do these today)
1) This process needs a docker image. @ajlail98 could maybe look around to find one? https://github.com/KwanLab/Autometa/blob/0ec00874d11238ae400fb1f3d72212dd38aa717f/modules/local/get_genomes_for_mock.nf#L10
2) This one also: https://github.com/KwanLab/Autometa/blob/b73850e9c37ee49572c05af504040fd57d382a7d/modules/local/mock_data_reporter.nf#L14-L15
I have an example there, but it has to be built first. Maybe that's okay if the mock_data is only going to be used by developers, where instructions to build the image first could be provided.
Note: that Dockerfile is a modified version of https://github.com/rocker-org/rocker/blob/master/r-rmd/Dockerfile where `procps` is also installed (required by Nextflow), so the Rocker project license would have to be included.
3) Last, just a note: when I happened to run this with "GCF_013307045.1", it failed because no markers were found. May be worth looking into.
:memo: I've added a tag to the `GET_GENOMES_FOR_MOCK` process in `get_genomes_for_mock.nf` so the user can easily tell how many genomes are being fetched for the mock community.
Build the `jason-c-kwan/autometa:dev` image (`docker build . -t jason-c-kwan/autometa:dev`) prior to running: `nextflow run . -profile docker -params-file "nf-params.json" --mock_test true --input .`
nf-params.json
{
"autometa_image_tag": "dev"
}
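The build-then-run sequence described above can be sketched as a short script. This is a minimal sketch, assuming it is run from the root of the Autometa repository with Docker and Nextflow on the `PATH`. The `DRY_RUN` variable and the `run` helper are hypothetical additions for illustration and are not part of the pipeline; by default the script only prints the commands it would run.

```shell
#!/bin/sh
set -eu

# Hypothetical helper (not part of Autometa): print the command when
# DRY_RUN=1 (the default here), otherwise execute it.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Write the params file shown above so the pipeline uses the "dev" image tag.
cat > nf-params.json <<'EOF'
{
    "autometa_image_tag": "dev"
}
EOF

# Build the dev image first, then run the pipeline on the mock data.
run docker build . -t jason-c-kwan/autometa:dev
run nextflow run . -profile docker -params-file nf-params.json --mock_test true --input .
```

Running it with `DRY_RUN=0` executes the two commands instead of printing them.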
I've also added Dockerfiles for the processes you've mentioned. I was not sure where to put these, so I've opted to place them in a `$HOME/Autometa/docker/modules` sub-directory. If you have guidance on where these should be placed, feel free to move them. If you do move them, the Makefile command `modules-images` will need to be updated to conform to the new paths.
i.e., to build all Autometa Nextflow module Docker images from the Makefile: `make modules-images`
> A couple of things to fix (or not) before merging in
> - This process needs a docker image. (modules/local/get_genomes_for_mock.nf)

The image for `get_genomes_for_mock.nf` can be built with `docker/modules/get_genomes_for_mock.Dockerfile`.

> This one also: https://github.com/KwanLab/Autometa/blob/b73850e9c37ee49572c05af504040fd57d382a7d/modules/local/mock_data_reporter.nf#L14-L15
> I have an example there but it has to be built first. Maybe that's okay if the mock_data is only going to be used by developers, where instructions to build the image first could be provided. Note: that dockerfile is a modified version of: https://github.com/rocker-org/rocker/blob/master/r-rmd/Dockerfile where `procps` is also installed (required by Nextflow), so the Rocker project license would have to be included

I've used conda to create the R env. The image for `mock_data_reporter.nf` can be built with `docker/modules/mock_data_reporter.Dockerfile`.
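For anyone who wants to build the two module images by hand rather than through `make modules-images`, a rough sketch follows. It assumes the Dockerfiles live in `docker/modules` as described above and that the repository root is a suitable build context. The tag prefix, the `DRY_RUN` variable, and the helper functions are placeholders for illustration; the actual Makefile target may use different tags or arguments.

```shell
#!/bin/sh
set -eu

# Hypothetical helper (not part of Autometa): print the command when
# DRY_RUN=1 (the default here), otherwise execute it.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Placeholder tag prefix; not taken from the repository.
TAG_PREFIX="${TAG_PREFIX:-autometa-modules}"

# Build one image per module Dockerfile, using the repo root as build context.
build_module_images() {
    for module in get_genomes_for_mock mock_data_reporter; do
        run docker build \
            -f "docker/modules/${module}.Dockerfile" \
            -t "${TAG_PREFIX}/${module}:dev" \
            .
    done
}

build_module_images
```

Whatever tags are chosen here would need to match the container names referenced in the corresponding `.nf` modules.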
Part of but does-not-fix: https://github.com/KwanLab/Autometa/issues/152