ksahlin / isONcorrect

Error correction of ONT transcript reads
GNU General Public License v3.0

Racon CUDA Interface for isONCorrect #10

Open jkbenotmane opened 3 years ago

jkbenotmane commented 3 years ago

Hi @ksahlin,

The CUDA build of Racon has a slightly different command-line interface, but I think it should be easy to adopt, and CUDA correction could speed up and possibly improve the resulting correction.

The command at line 251 of create_augmented_reference.py would then be something like this, I think:

subprocess.check_call(['/Path/to//racon/build/bin/racon', '-c', '16', '-b', '--cudaaligner-batches', '16', reads_to_center, read_alignments_paf, center_file], stdout=racon_polished, stderr=racon_stderr)

So run_isoncorrect could either keep expecting racon in PATH, or allow more fine-grained tuning of racon's parameters through its arguments.

Maybe allow something like

--use_racon "Path to Racon" --racon_params "-c 16 -b --cudaaligner-batches 16 ...."

Though I am not sure whether this could interfere with other parts of the code.
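A minimal sketch of how the two options suggested above might be wired up. The option names are the ones proposed in this comment and are not part of the current run_isoncorrect interface; the helper build_racon_cmd, the binary path and the file names are hypothetical placeholders.

```python
import argparse
import shlex


def build_racon_cmd(racon_bin, racon_params, reads, overlaps_paf, target):
    """Return the argv list for one racon call (no shell involved)."""
    # shlex.split turns "-c 16 -b" into ["-c", "16", "-b"], so each flag
    # reaches racon as its own argument.
    return [racon_bin, *shlex.split(racon_params), reads, overlaps_paf, target]


parser = argparse.ArgumentParser()
# Hypothetical options; not part of the current run_isoncorrect interface.
parser.add_argument("--use_racon", default="racon",
                    help="Path to the racon binary (default: look it up in PATH)")
parser.add_argument("--racon_params", default="",
                    help="Extra racon options as one quoted string")

# Placeholder path and file names, for illustration only.
args = parser.parse_args(
    ["--use_racon", "/opt/racon/bin/racon",
     "--racon_params=-c 16 -b --cudaaligner-batches 16"]
)
cmd = build_racon_cmd(args.use_racon, args.racon_params,
                      "reads_to_center.fq", "alignments.paf", "center.fa")
print(cmd)
```

The resulting list could be passed directly to subprocess.check_call. Writing the value as --racon_params="..." (with the equals sign) is probably the safer form on the command line, since argparse can otherwise mistake a value that starts with a dash for another option.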

Originally posted by @jkbenotmane in https://github.com/ksahlin/isONcorrect/issues/9#issuecomment-900280835

ksahlin commented 3 years ago

Thanks for the pointers on how to run the cuda version!

The speedup achieved by this cuda integration depends on how large a fraction of the total runtime is occupied by racon.

From previous experiments, I think racon occupied around 20-25% of the total runtime. So even if, say, racon-cuda is 10x(?) faster than racon, it would only mean a speedup of ~20% for isONcorrect.
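The estimate above is an instance of Amdahl's law. A quick sketch, taking the 20-25% share and the 10x factor as the rough guesses they are (neither has been measured here):

```python
def overall_speedup(fraction, stage_speedup):
    """Amdahl's law: whole-pipeline speedup when only one stage gets faster."""
    return 1.0 / ((1.0 - fraction) + fraction / stage_speedup)


# Assumed inputs: racon takes 20% or 25% of the runtime and gets 10x faster.
for fraction in (0.20, 0.25):
    s = overall_speedup(fraction, 10.0)
    saved = 1.0 - 1.0 / s
    print(f"racon share {fraction:.0%}: {s:.2f}x overall, {saved:.1%} less runtime")
```

Under these assumptions the total runtime drops by about 18% to 22.5% (1.22x to 1.29x overall). Even an infinitely fast racon could not do better than 1 / (1 - fraction), i.e. 1.25x to 1.33x.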

Some more profiling is needed to check if this is worth it.
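One lightweight way to get that number might be to wrap the racon call and the remaining work in timers and compare the totals. This is only a sketch: timed, runtime_share and stage_seconds are hypothetical names, and the sleep calls stand in for the real work.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

stage_seconds = defaultdict(float)


@contextmanager
def timed(stage):
    """Add the wall-clock time spent inside the with-block to stage_seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        stage_seconds[stage] += time.perf_counter() - start


def runtime_share(stage):
    """Fraction of all recorded time that was spent in one stage."""
    total = sum(stage_seconds.values())
    return stage_seconds[stage] / total if total else 0.0


# Stand-ins for real work; in isONcorrect the "racon" block would wrap
# the subprocess call and "other" the rest of the correction step.
with timed("racon"):
    time.sleep(0.02)
with timed("other"):
    time.sleep(0.06)

print(f"racon share of runtime: {runtime_share('racon'):.0%}")
```

Because the timers accumulate, the same with-blocks can sit inside a loop over batches and still give one share per stage at the end.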

jkbenotmane commented 3 years ago

Makes sense to me.

I may not have fully foreseen the impact on the rest of the code.

jkbenotmane commented 3 years ago

Might CUDA support be interesting for the future, or did your evaluation show any major obstacles to implementing it? (I got an index-out-of-range error when trying to use it.)

ksahlin commented 3 years ago

I won't have time to look into this in the near future. But happy to take pull requests.