JensUweUlrich / ReadBouncer

Fast and scalable nanopore adaptive sampling
GNU General Public License v3.0

Introduce TOML configuration file #22

Closed JensUweUlrich closed 2 years ago

JensUweUlrich commented 2 years ago

Add support for reading and writing a ReadBouncer configuration file that contains all parameter settings, in order to improve the reproducibility of experiments. ReadBouncer should then be started using only the TOML configuration file.

JensUweUlrich commented 2 years ago

Template for configuration TOML file

[General]
usage = "build", "deplete", "target", "classify" # atm only one of those
output_dir = path/to/write/output/files/to => all generated output files will be stored here
log_directory = path/to/write/log/files/to [IbfClassificationLog.txt, InterleavedBloomFilterLog.txt, NanoLiveLog.txt, ReadUntilClientLog.txt]

[IBF]
kmer_size = X (unsigned integer with default 13) => only required for 'usage = "build"' or if target_file/deplete_file is a fasta format file
fragment_size = X (unsigned integer with default 100000) => only required for 'usage = "build"' or if target_file/deplete_file is a fasta format file
threads = X (unsigned integer with default 3)
target_files = xxx.fasta/xxx.ibf => can be a comma-separated list of fasta or ibf files; automatically create ibf file for every given fasta file
deplete_files = xxx.fasta/xxx.ibf => can be a comma-separated list of fasta or ibf files; automatically create ibf file for every given fasta file
read_files = xxx.fasta/xxx.fastq => can be a comma-separated list of fasta or fastq files; only required for 'usage = "classify"'
exp_seq_error_rate = X (unsigned float between 0 and 1 with default 0.1) => ignored for 'usage = "build"'
chunk_length = X (unsigned integer with default 250)
max_chunks = X (unsigned integer with default 5)

[MinKNOW]
host = xxx.xxx.xxx (ip address or name of the computer hosting MinKNOW) => always do automatic connection test before everything else if 'usage = "deplete"' or 'usage = "target"'
port = X (port number used for gRPC communication by the MinKNOW instance)
flowcell = X (name of the flowcell used)

[Basecaller]
caller = DeepNano/Guppy (default is DeepNano)
host = (ip address or name of the computer hosting Guppy Basecall Server) => always do automatic connection test before everything else if 'usage = "deplete"' or 'usage = "target"'
port = X (port number on which the basecall server is running on the host)
threads = X (unsigned integer with default 3)
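
To make the template above more concrete, here is a minimal sketch of what a filled-in file for 'usage = "target"' might look like. The key names and defaults are taken from the template; everything else is an assumption for illustration only: the paths, file names, host names, port numbers and flowcell name are hypothetical placeholders, and writing the file lists as TOML arrays (rather than a single comma-separated string) is a guess at how the final format could represent them.

```toml
# Hypothetical example configuration; all values are placeholders.

[General]
usage = "target"                          # only one usage mode at a time
output_dir = "/path/to/output"            # hypothetical path
log_directory = "/path/to/logs"           # hypothetical path

[IBF]
kmer_size = 13                            # template default
fragment_size = 100000                    # template default
threads = 3                               # template default
target_files = ["target_genome.fasta"]    # hypothetical file; array syntax is an assumption
exp_seq_error_rate = 0.1                  # template default
chunk_length = 250                        # template default
max_chunks = 5                            # template default

[MinKNOW]
host = "localhost"                        # placeholder host name
port = 9501                               # placeholder; use the port of your MinKNOW instance
flowcell = "MS00000"                      # placeholder flowcell name

[Basecaller]
caller = "Guppy"                          # template default would be "DeepNano"
host = "localhost"                        # placeholder host of the Guppy basecall server
port = 5555                               # placeholder; use the port of your basecall server
threads = 3                               # template default
```

Because every value in a TOML file must be a typed literal, strings such as paths and host names need quotes, while kmer_size, port and similar keys are plain integers. Whether keys that are irrelevant for a given usage mode may simply be omitted is not specified in the template.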