Closed UnixJunkie closed 4 months ago
number of unique ones, but also how many times each one was seen
Maybe a one-time pass over a set of fragmented molecules with an RDKit Python script can do the job; it would create a dictionary from frag_smi to cano_frag_smi, plus one from cano_frag_smi to a unique id.
Or maybe a special option in fasmifra to write all the encountered fragments to a file; then we'll process them with a Python RDKit script for canonicalization and counting.
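A minimal sketch of what such a canonicalize-and-count pass could look like. This is not the actual script; the function names (`rdkit_canonical_smiles`, `build_frag_dict`) are hypothetical, and it assumes each fragment is available as a standalone SMILES string that RDKit can parse, which may not match what fasmifra really writes out. RDKit is imported lazily so the counting logic can be tried without it.

```python
from collections import Counter


def rdkit_canonical_smiles(smi):
    """Canonicalize one SMILES with RDKit; return None if it does not parse."""
    from rdkit import Chem  # lazy import: only needed for real canonicalization
    mol = Chem.MolFromSmiles(smi)
    return None if mol is None else Chem.MolToSmiles(mol)


def build_frag_dict(frag_smiles, canonicalize=rdkit_canonical_smiles):
    """Map raw fragments to canonical ones, give each canonical fragment
    a unique integer id, and count how often each one was seen.

    ASSUMPTION: frag_smiles is an iterable of fragment SMILES strings;
    the real fragment file format may differ.
    """
    frag_to_cano = {}   # frag_smi -> cano_frag_smi
    cano_to_id = {}     # cano_frag_smi -> unique id (order of first sight)
    counts = Counter()  # cano_frag_smi -> number of occurrences
    for smi in frag_smiles:
        cano = frag_to_cano.get(smi)
        if cano is None:
            cano = canonicalize(smi)
            if cano is None:
                continue  # skip fragments that cannot be parsed
            frag_to_cano[smi] = cano
        if cano not in cano_to_id:
            cano_to_id[cano] = len(cano_to_id)
        counts[cano] += 1
    return frag_to_cano, cano_to_id, counts
```

With RDKit installed, `build_frag_dict(line.strip() for line in open(path))` would give the unique fragments via `len(cano_to_id)` and their frequencies via `counts`; `path` here stands for whatever file holds one fragment per line.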
The fasmifra exe now has the -of option; we still need the RDKit Python canonicalization script.
There is now the bin/fasmifra_frag_dict.py script to postprocess the output of fasmifra's new -of option.
for each fragment: