mrvollger / SDA

Segmental Duplication Assembler (SDA).
MIT License

support CCS data #9

Closed. Arkarachai closed this issue 4 years ago.

Arkarachai commented 5 years ago

Would it be possible to support CCS data as well? I think you would need to add an option to pass a preset mapping-parameter keyword to minimap2 in SDA.1.snakemake.py and SDA.2.snakemake.py, but I'm not sure whether any other part of the program needs to change.

mrvollger commented 5 years ago

Hi Arkarachai, I am working on updates to allow for CCS data that I have not yet pushed.

I will update you when I have pushed them. Though I will say that in my preliminary analysis CLR data does better than HiFi with the current parameters, because read length is very important for linking together distant PSVs and HiFi reads are quite a bit shorter.

Best, Mitchell

Arkarachai commented 5 years ago

Sounds good. Regarding your update: would there be any changes besides the preset parameter for minimap2? Also, do you have a rough timeline for the new release?

mrvollger commented 5 years ago

I have a pressing project at the moment that will be done at the beginning of next week, so I hope to get to this sometime next week.

Just an FYI: minimap2 will align CCS reads pretty well with just the map-pb preset, since it is forced to do dynamic programming alignments. So if it is urgent, you could just try it and see what happens.
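As a rough illustration of that suggestion, here is a minimal sketch of building a minimap2 command line with a configurable preset. The function name `build_minimap2_cmd` and the file names are hypothetical and are not part of SDA; only the minimap2 flags `-a` (SAM output), `-x` (preset), and `-t` (threads) are real. How SDA's Snakemake files actually invoke minimap2 is not shown in this thread.

```python
import shlex


def build_minimap2_cmd(reference, reads, preset="map-pb", threads=4):
    """Return a minimap2 argument list (hypothetical helper, not SDA code).

    preset is passed to minimap2's -x option; "map-pb" is the preset
    mentioned in this thread as a workable choice for CCS reads.
    """
    return [
        "minimap2",
        "-a",              # emit SAM instead of PAF
        "-x", preset,      # preset mapping parameters
        "-t", str(threads),
        reference,
        reads,
    ]


# Placeholder file names; substitute your own reference and CCS reads.
cmd = build_minimap2_cmd("ref.fasta", "ccs_reads.fastq")
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once minimap2 is on the `PATH`. Keeping the preset as a parameter is what would let a workflow switch between CLR and CCS settings without editing the command itself.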

The other change I would like to make concerns the statistical test for the fraction of reads that must have PSVs in order for two PSVs to be linked: it needs to be changed to reflect the error rate of HiFi instead of CLR.
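To show why the error rate matters for such a test, here is a purely illustrative sketch. It is not SDA's actual test, which is not described in this thread. It assumes a simple one-sided binomial test: under the null hypothesis, each read shows the shared variants only because of sequencing error, with a per-read probability `error_rate`. The function names, the significance level, and the example error rates (roughly 0.15 for CLR and 0.01 for HiFi) are all assumptions chosen for illustration.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def psvs_linked(shared_reads, total_reads, error_rate, alpha=0.01):
    """Hypothetical linkage rule, not SDA's implementation.

    Link two PSVs only if seeing `shared_reads` or more supporting reads
    out of `total_reads` is unlikely (p < alpha) under sequencing error alone.
    """
    return binomial_tail(shared_reads, total_reads, error_rate) < alpha


# Same evidence, different assumed error rates.
print(psvs_linked(6, 30, error_rate=0.15))  # CLR-like error rate
print(psvs_linked(6, 30, error_rate=0.01))  # HiFi-like error rate
```

Under these assumptions, six supporting reads out of thirty are not enough to reject the error-only explanation at a CLR-like error rate, but they are at a HiFi-like one. That is the kind of threshold shift a CCS-aware test would have to account for.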

mrvollger commented 4 years ago

It took much longer than I hoped to get back to this, but it is now done as of https://github.com/mrvollger/SDA/pull/10. Fortunately, this update comes with changes that should make SDA a bit easier to run.