tinyheero closed this issue 3 years ago
Hi Fong,
Reference generation is currently integrated into the HISATgenotype modules. Version 1.3.1 (coming out within a few weeks) removes this requirement and adds an option to pre-build all required indices at install time inside the HISATgenotype install folder. Alternatively, you can build them in any data directory of your choice during the first run of HISATgenotype on the system. Will this new system meet your needs?
Thanks, Chris
That's great to hear that the newer version will decouple the reference generation from the HISATgenotype modules. Regarding the two options:
Is there an option to combine the two, where you pre-build the references before any runs but specify the data directory in which to store them? That way, every run would use that data directory.
If not, I guess one can think of the first run of HISAT-genotype as the reference generation step. One could use the example data provided in the tutorial to trigger this step. The reference data could then be stored and reused on different clusters/machines. Is that reasoning correct?
Hi Fong,
Sure thing! It can certainly be decoupled completely. I'll add some additional options or an independent wrapper/script to build/download the references before running hisatgenotype. Then it is a matter of specifying that directory during each run of hisatgenotype using the new syntax I have added to v1.3.1. I'll draft a script next week, test it, then integrate it into the new release.
Thanks, Chris
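For anyone wiring this into a larger workflow, the "build once, then point every run at the same directory" pattern described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `ensure_references`, the marker file name, and `build_cmd` are hypothetical names, not part of HISATgenotype, and the actual build/download command is passed in by the caller because its syntax is not given in this thread.

```python
import subprocess
from pathlib import Path


def ensure_references(index_dir, build_cmd, marker_name=".references_ready"):
    """Run build_cmd once to populate index_dir; skip it on later calls.

    build_cmd is a placeholder for whatever reference build/download
    command the release provides (an assumption, not a documented API).
    Returns True if the build ran, False if the marker already existed.
    """
    index_dir = Path(index_dir)
    marker = index_dir / marker_name
    if marker.exists():
        # A previous call finished successfully; reuse the directory.
        return False
    index_dir.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if the build fails, so the
    # marker is only written after a successful build.
    subprocess.run(build_cmd, cwd=index_dir, check=True)
    marker.touch()
    return True
```

A workflow would call this once as its own step, then pass the same directory to each genotyping run using whatever option the release documents for that purpose. The marker file is one simple way to make the step idempotent; a workflow manager's own output tracking would serve equally well.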
Amazing! Thanks Chris!
Hi Fong,
The new version of HISATgenotype (1.3.1) has been released and has a new option to direct HISATgenotype to an index folder. You should now have to download the index only once, and only at install if you prefer. The manual will be updated with these changes soon. Let me know if you have any issues getting things added to a Docker image in the meantime. Thanks!
Thanks, Chris
Hi there,
I've noticed that when running HISAT-genotype, it will first download and prepare a set of references if they don't already exist. I am interested in figuring out how to separate the reference generation from the actual running of the genotyping step. This would ease the integration of HISAT-genotype into a larger workflow.
I assume that the wrapper hisatgenotype calls a set of underlying scripts to prepare the references. Is it possible to call these underlying scripts independently of the wrapper to generate the required references? If so, which scripts should one call?