bcgsc / straglr

Tandem repeat expansion detection or genotyping from long-read alignments

TRF options #15

Closed fabienkst closed 1 year ago

fabienkst commented 1 year ago

Hello!

I have been using your tool for a few months now, and I have been wondering whether it would be possible to modify the parameters used to run TRF. From what I've seen in the source code, they are fixed to this string: '2 5 5 80 10 10 500 -d -h' (it is the default value at initialization, and the same value appears in the "make_trf_sensitive" function).

Perhaps I am wrong, and this is intended for optimized results...

I thank you sincerely for your understanding.

readmanchiu commented 1 year ago

Yes, I can make the TRF parameters ("2 5 5 80 10 10 500") a Straglr parameter that users can change, with the default set to what is currently hard-coded. Are you using v1.2? I'm asking because the current code in the main repository has some other changes that will make up the next PR once testing is done. If you are using v1.2 and are not ready to try the new version, I can add this option to the 1.2a branch, which I created to address a minor bug.

fabienkst commented 1 year ago

Hello! Thank you for your response.

I was using v1.2 until yesterday, when I noticed that you had committed a version change to v1.3.0a. I have since reinstalled the tool with the new version. If the changes are not too important, I can go back to v1.2.

Best regards.

readmanchiu commented 1 year ago

@fabienkst, I've made trf_args an argument you can now specify when you run straglr.py.

--trf_args Match Mismatch Delta PM PI Minscore MaxPeriod
                        tandem repeat finder arguments. Default:2 5 5 80 10 10 500

It's a 7-integer argument, representing the Match, Mismatch, Delta, PM, PI, Minscore and MaxPeriod parameters, as per https://tandem.bu.edu/trf/trf.unix.help.html. (The "-d -h" flags are required to keep the output format that Straglr processes.)

Hope it serves your needs, and if you find better parameters please share with us!
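
For readers who want to see how such an option fits together, here is a minimal sketch. It is not Straglr's actual code: the helper names `build_parser` and `build_trf_command`, the `trf` binary name, and the input file name are assumptions made for illustration. It declares a 7-integer `--trf_args` option with the default from this thread and appends the fixed "-d -h" flags when assembling the TRF command line.

```python
import argparse

# Default from this thread: Match Mismatch Delta PM PI Minscore MaxPeriod
DEFAULT_TRF_ARGS = [2, 5, 5, 80, 10, 10, 500]
TRF_FIELD_NAMES = ("Match", "Mismatch", "Delta", "PM", "PI", "Minscore", "MaxPeriod")
# Flags the thread says must stay fixed so the output format does not change
FIXED_TRF_FLAGS = ["-d", "-h"]


def build_parser():
    """Hypothetical parser exposing only a --trf_args option (illustration only)."""
    parser = argparse.ArgumentParser(description="sketch of a --trf_args option")
    parser.add_argument(
        "--trf_args",
        type=int,                 # each of the seven values must parse as an integer
        nargs=7,                  # exactly seven values are required
        metavar=TRF_FIELD_NAMES,  # one display name per value in the help text
        default=DEFAULT_TRF_ARGS,
        help="tandem repeat finder arguments. Default: 2 5 5 80 10 10 500",
    )
    return parser


def build_trf_command(fasta_path, trf_args, trf_binary="trf"):
    """Assemble a TRF argument list: binary, input file, seven integers, fixed flags.

    `trf_binary` and `fasta_path` are placeholders; adjust them to your setup.
    """
    if len(trf_args) != len(TRF_FIELD_NAMES):
        raise ValueError("expected 7 TRF parameters, got %d" % len(trf_args))
    return [trf_binary, fasta_path] + [str(int(v)) for v in trf_args] + FIXED_TRF_FLAGS


if __name__ == "__main__":
    # Illustrative values only, not a recommendation
    args = build_parser().parse_args(
        ["--trf_args", "2", "7", "7", "80", "10", "50", "500"]
    )
    print(" ".join(build_trf_command("reads.fa", args.trf_args)))
```

On the command line, the same override corresponds to passing `--trf_args 2 7 7 80 10 50 500` to straglr.py; omitting the option keeps the default of 2 5 5 80 10 10 500.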