R's integers are 32-bit signed values, so they max out at 2,147,483,647 (about 2.147 x 10^9). Many genomes (e.g. human) and other data sets for which people want to compute N50s (e.g. long-read sequence data) have cumulative sizes far in excess of this. For this function to operate on these larger data sets, it needs to leave the data in numeric (i.e. double-precision) form.
I've created a pull request (#27) that removes the cast to integer that the function performs.
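
A minimal sketch of the failure mode, for anyone who wants to reproduce it. The contig lengths are made up, and `n50_sketch` is a hypothetical stand-in written for this example, not the package's actual function; it only illustrates that the cumulative total has to be held in doubles.

```r
# Hypothetical contig lengths in bp, stored as doubles; the total is 3.1e9.
contig_lengths <- c(1.2e9, 9e8, 6e8, 4e8)

# Each value fits in a 32-bit integer, but the total does not:
# summing the integer vector overflows and R returns NA with a warning.
int_total <- suppressWarnings(sum(as.integer(contig_lengths)))

# Left as doubles, the total is exact (doubles represent every
# integer up to 2^53 exactly).
dbl_total <- sum(contig_lengths)

# Hypothetical N50 on doubles: the length at which the running total of
# lengths, sorted longest first, reaches half of the overall total.
n50_sketch <- function(lengths) {
  sorted <- sort(as.numeric(lengths), decreasing = TRUE)
  sorted[which(cumsum(sorted) >= sum(sorted) / 2)[1]]
}

print(int_total)
print(dbl_total)
print(n50_sketch(contig_lengths))
```

With the integer cast in place, any step that accumulates lengths (`sum` or `cumsum`) can hit the same overflow once the running total passes `.Machine$integer.max`.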