Open iprada opened 5 years ago
Hi Iñigo,
Thanks for trying out AA and the suggestion of using Numba. The runtime speedup is currently a non-trivial problem from a user perspective and would only be accessible to advanced users like you. It will be better to implement the runtime speedups internally rather than adding complicated user documentation for that. I will work on that in the near future.
For the parameter --extendmode:
EXPLORE
CLUSTERED with each bed file as input 1 at a time.
CLUSTERED. This enables parallelization.
UNCLUSTERED.
For the parameter --runmode:
Since you only want the breakpoint edges, you can use the value BPGRAPH. This should not take much longer than purely generating the breakpoint edges, and AA currently has no option to stop right after breakpoint edge generation. However, if you think that is not the case, you can find intermediate files with the filename format {out}_amplicon{ampliconid}_edges_cnseg.txt in the output directory.
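As a rough illustration of picking up those intermediate files, here is a minimal Python sketch. The helper name find_edge_files is hypothetical, and it assumes that {out} is the plain file-name prefix of the run (not a full path) and that the files sit directly in the output directory.

```python
import glob
from pathlib import Path


def find_edge_files(output_dir, out_prefix):
    """Return files named {out}_amplicon{ampliconid}_edges_cnseg.txt, sorted by name.

    Hypothetical helper: assumes out_prefix is the bare prefix used for the
    run and that the files are located directly inside output_dir.
    """
    # Escape the prefix so characters such as '[' are matched literally.
    pattern = f"{glob.escape(out_prefix)}_amplicon*_edges_cnseg.txt"
    return sorted(Path(output_dir).glob(pattern))
```

Calling find_edge_files("results", "sample1") would then list every per-amplicon edge file written so far, which is one way to check whether breakpoint edge generation has already finished.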
If it is still taking a while, try downsampling to 10X coverage. AA should still perform well after downsampling.
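To make the downsampling arithmetic concrete, the sketch below computes the fraction of reads to keep in order to go from a known coverage to the 10X target. The function name downsample_fraction is hypothetical, and it assumes that coverage scales linearly with the number of reads kept.

```python
def downsample_fraction(current_coverage, target_coverage=10.0):
    """Fraction of reads to keep so that coverage drops to target_coverage.

    Assumes coverage is proportional to the number of reads retained.
    Returns 1.0 when the data is already at or below the target.
    """
    if current_coverage <= 0 or target_coverage <= 0:
        raise ValueError("coverage values must be positive")
    return min(1.0, target_coverage / current_coverage)
```

For a 30X data set such as the one described in this thread, this gives a fraction of one third, which could then be passed to whatever read-subsampling tool you use.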
Let me know if this helps. Happy to hear more comments from you! Best, Viraj
Dear Viraj,
It is me again. I have managed to get past the ZeroDivisionError by filtering out the reads with low mapping quality. I am writing to ask whether you could add a few lines to the software description on how to tune the parameters to speed up AmpliconArchitect. I am running it on my simulation set of 13000 circles simulated at 30X (that is, around 10000000x2 Illumina raw reads), and the software has been running for 4h at the moment I am writing this.
In my use case, I am only interested in detecting the circular DNA breakpoints. I am not interested in reconstructing the fine structure of the amplicons. Hence, I expect that playing around with the parameters --runmode and --extendmode might make the software run faster. If that is the case, could you provide a minimal explanation of how to set these parameters?
PS: I have seen in your paper and in the source code that you have some very heavy numerical calculations. You are probably aware of this, but I think that AmpliconArchitect (as I have said before, I think it is great that it can reconstruct the amplicons) could benefit a lot from just-in-time compilation of the main mathematical functions using Numba. I have personally obtained speed-ups of around 2 orders of magnitude for the numerically heavy functions of Circle-Map, and the implementation effort has been minimal.
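To show what I mean by minimal effort, here is a small sketch of the pattern. The function poisson_cdf is only a hypothetical stand-in for a numerically heavy loop and is not taken from AmpliconArchitect or Circle-Map. The fallback decorator is there just so the example still runs, without any speed-up, when Numba is not installed.

```python
import math

try:
    from numba import njit  # compiles the decorated function on first call
except ImportError:
    def njit(func):
        # Fallback: no Numba available, run the plain Python function.
        return func


@njit
def poisson_cdf(lam, n):
    """P(X <= n) for X ~ Poisson(lam), summed term by term in log space."""
    total = 0.0
    for k in range(n + 1):
        # log of the Poisson pmf at k: k*log(lam) - lam - log(k!)
        total += math.exp(k * math.log(lam) - lam - math.lgamma(k + 1.0))
    return total
```

The only change relative to the plain Python version is the decorator, so the function body stays readable and can still be tested without Numba.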
Best wishes,
Iñigo