lczech / grenedalf

Toolkit for Population Genetic Statistics from Pool-Sequenced Samples, e.g., in Evolve and Resequence experiments
GNU General Public License v3.0

Diversity by chromosomes output file - tiny issue? #9

Closed SheepwormJM closed 1 year ago

SheepwormJM commented 1 year ago

Hi guys,

This is a tiny issue (and may be deliberate) but thought I would flag it in case not.

When run per chromosome, the diversity output file (with all three measures calculated and written to one file) reports both the start and the end of each chromosome as zero. Not sure if you had wanted to report the last base of each chromosome as the end instead. Just thinking it would make the number of SNPs and coverage more immediately relevant.

All the best, Jenni

lczech commented 1 year ago

Hi Jenni,

thanks for opening the issue - even the potentially tiny ones deserve attention!

In this case, it was mostly just a little implementation detail that held me back from reporting this in the output file. But I looked at it again, and it's not hard to fix. It will be in grenedalf soon, in two flavors: report the last position that was found in the data as the end of the chromosome window, or, if a reference genome or dict/fai file is given, report the length of the chromosome from there. I hope that makes sense ;-)

> Just thinking it would make the number of SNPs and coverage more immediately relevant.

You mean, in the sense that this would allow comparison to the number of SNP positions that were used for the statistics? Yes, I can see that this would make it a bit easier. Still, I'd hope that you know the lengths of your chromosomes of the ref genome anyway :-D

Cheers and so long Lucas

SheepwormJM commented 1 year ago

Haha, thanks Lucas! Yes. Although it's just one less step for me... ;) so lazy I am...

lczech commented 1 year ago

Hey Jenni,

I've implemented this now, and it will be on the dev branch, and then in the next release v0.3.0. When providing a reference genome or dict/fai file, the length from there will be used. If not, the last position that was used for the computations of the command will be reported instead. Note though that similar to #10, this might not be the last position of the input file, as some commands already filter out "uninteresting" positions beforehand... right now that's a limitation of the implementation, but might be changed later. Hope that still works for you.

Cheers Lucas
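
For readers following along, the rule described in the comment above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not grenedalf's actual code (grenedalf itself is a C++ tool). The function names `read_fai_lengths` and `chromosome_window_end` are hypothetical, and the fallback for a chromosome that is missing from the reference is an assumption. The only external fact relied on is the `.fai` index layout, whose first two tab-separated columns are the sequence name and its length.

```python
from typing import Dict, Iterable, Optional


def read_fai_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Collect chromosome lengths from the lines of a .fai index.

    Each .fai line is tab-separated; column 1 is the sequence name and
    column 2 is its length in bases. Remaining columns are ignored.
    """
    lengths: Dict[str, int] = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip empty lines
        fields = line.split("\t")
        lengths[fields[0]] = int(fields[1])
    return lengths


def chromosome_window_end(
    chrom: str,
    last_position_used: int,
    ref_lengths: Optional[Dict[str, int]] = None,
) -> int:
    """Pick the end coordinate to report for a whole-chromosome window.

    If reference lengths are available, report the chromosome length.
    Otherwise report the last position that entered the computation,
    which may be smaller than the true chromosome length if positions
    were filtered out beforehand.
    """
    # Assumption: a chromosome absent from the reference falls back to
    # the last position used, rather than raising an error.
    if ref_lengths is not None and chrom in ref_lengths:
        return ref_lengths[chrom]
    return last_position_used


if __name__ == "__main__":
    # Made-up example index lines, not real data.
    fai_lines = ["chr1\t1000\t6\t60\t61\n", "chr2\t500\t1030\t60\t61\n"]
    lengths = read_fai_lengths(fai_lines)
    print(chromosome_window_end("chr1", 987, lengths))  # with reference
    print(chromosome_window_end("chr1", 987))           # without reference
```

With a reference, the reported end is the full chromosome length, so the SNP count can be compared against it directly. Without one, the reported end is only a lower bound on the chromosome length.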

SheepwormJM commented 1 year ago

Thanks Lucas!

