Closed josiahseaman closed 4 years ago
`partitions, bin2file_mapping = schematic.split(args.cells_per_file)` is a good place to start.
chunk05.fasta
@mandosoft Thomas is working on this issue.
Bin math: Bin 0 has a reserved meaning. Biologists count nucleotide indices starting at 1, not 0. That means that with a bin width of 1,000, positions 1 to 1,000 are in Bin 1. Target position X is in Bin label `ceil(X / 1,000)`. The FASTA position in the file (0-indexed) for Bin Y is `[(Y - 1) * 1,000 : Y * 1,000]`, exclusive of the end index (the default slicing behavior in Python and JavaScript). Nucleotide index Z for the Pangenome is `Fasta[Z - 1]`.
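The bin arithmetic above can be sketched in Python as follows. This is only an illustration of the index conversions, not code from component_segmentation: the names `BIN_WIDTH`, `bin_label`, `bin_slice` and `nucleotide_at` are hypothetical, and the FASTA sequence is assumed to be already loaded as a plain string.

```python
BIN_WIDTH = 1000  # assumed bin width, matching the example in this issue


def bin_label(position, bin_width=BIN_WIDTH):
    """Return the 1-based bin label for a 1-based nucleotide position."""
    if position < 1:
        raise ValueError("nucleotide positions are 1-based; bin 0 is reserved")
    # Integer form of ceil(position / bin_width)
    return (position + bin_width - 1) // bin_width


def bin_slice(label, bin_width=BIN_WIDTH):
    """Return the 0-indexed, end-exclusive slice of the sequence for a bin."""
    if label < 1:
        raise ValueError("bin 0 is reserved")
    return slice((label - 1) * bin_width, label * bin_width)


def nucleotide_at(sequence, index):
    """Return the nucleotide at 1-based index Z, i.e. sequence[Z - 1]."""
    if index < 1:
        raise ValueError("nucleotide positions are 1-based")
    return sequence[index - 1]
```

For example, `bin_label(1000)` gives Bin 1 and `bin_label(1001)` gives Bin 2, while `sequence[bin_slice(2)]` covers nucleotide positions 1,001 to 2,000.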
Depends on: (https://github.com/vgteam/odgi/issues/88). component_segmentation reads in a single FASTA file. It chunks up the FASTA in parallel with the component dividers and assigns corresponding names for (https://github.com/graph-genome/Schematize/issues/17).