Closed CristinaZb closed 1 year ago
The manpage describes the log file and its different columns:
-l, --log filename
Output file for OTU merging statistics (18 columns separated by tabulations). OTUs are
processed in no specific order. For a given query OTU with potential parents, mumu
will order potential parents by decreasing similarity with the query OTU, then by
decreasing abundance, then by decreasing incidence (or spread), and finally by names
(increasing ASCIIbetical order). Each potential parent is tested, and the search stops
if parenthood criteria are matched or if the list is exhausted. The different columns
correspond to:
1. name of query OTU.
2. name of potential parent OTU.
3. percentage of similarity (float value ranging from 0 to 100).
4. total abundance of the query OTU (sum through all samples, positive integer).
5. total abundance of the potential parent OTU (sum through all samples, positive integer).
6. overlap abundance of the query OTU (sum through all samples where the potential parent OTU is also present, positive integer).
7. overlap abundance of the potential parent OTU (sum through all samples where the query OTU is also present, positive integer).
8. incidence of the query OTU (number of samples where the query OTU is present, positive integer).
9. incidence of the potential parent OTU (number of samples where the potential parent OTU is present, positive integer).
10. incidence of the potential parent OTU (number of samples where both the potential parent OTU and the query OTU are present, positive integer).
11. smallest abundance ratio (for each sample, compute the abundance of the potential parent OTU divided by the abundance of the query OTU, find the smallest value, float).
12. sum of the abundance ratios (positive integer).
13. average value of abundance ratios (float).
14. smallest non-null abundance ratio (exclude ratios for samples where the query OTU is present but not the potential parent OTU, float).
15. average value of non-null abundance ratios (exclude ratios for samples where the query OTU is present but not the potential parent OTU, float).
16. largest ratio value (float).
17. relative co-occurrence value (number of samples where both the potential parent OTU and the query OTU are present divided by the number of samples where the query OTU is present, float).
18. status: 'accepted' or 'rejected'. The potential parent OTU is either accepted as a parent, or rejected.
Abundance and incidence values in the log file correspond to the values in the original
input table. Abundance and incidence values can be updated only when the whole
dataset has been processed and all potential parents are known.
Also, to avoid circular linking among OTUs with the same abundance values, merging is
only possible with parent OTUs that are strictly more abundant than the query OTU. For
instance, OTUs of abundance one can only be merged with OTUs of abundance > 1.
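Since the log file itself has no header line, one option is to prepend one yourself. The following is only a minimal sketch: the column labels are hypothetical short names paraphrased from the manpage list above (mumu does not define them), and the function names `add_log_header` and `list_unexpected_merges` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical short labels for the 18 log columns, paraphrased from the
# manpage list above. mumu itself does not define or write these names.
MUMU_LOG_COLUMNS="query parent similarity query_abundance parent_abundance
query_overlap_abundance parent_overlap_abundance query_incidence
parent_incidence overlap_incidence min_ratio sum_ratios avg_ratio
min_non_null_ratio avg_non_null_ratio max_ratio relative_cooccurrence status"

# Print a tab-separated header line followed by the unchanged log content.
add_log_header() {
    # unquoted on purpose: split the labels on whitespace, one per line,
    # then join them with tabs
    printf '%s\n' ${MUMU_LOG_COLUMNS} | paste -s -d '\t' -
    cat -- "$1"
}

# Print accepted lines whose parent (column 5) is not strictly more abundant
# than the query (column 4). According to the manpage, there should be none.
list_unexpected_merges() {
    awk -F '\t' '$18 == "accepted" && $5 <= $4' "$1"
}
```

For example, `add_log_header log_file > log_with_header.tsv` writes a copy that spreadsheet or stats software can read with named columns, and `list_unexpected_merges log_file` should print nothing if the strict abundance rule holds.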
Also, why does the number of "accepted" matches not agree with the number of final ASVs?
"accepted" means that this particular ASV will be merged with a parent ASV. So, your initial number of ASVs, minus the number of accepted merges, should give you the final number of ASVs.
For example, with a test dataset:
mumu \
--otu_table tmp.table \
--match_list tmp.match_list \
--log log_file \
--new_otu_table tmp3 \
--minimum_match 84 \
--minimum_ratio_type min
# check results
grep -c "accepted$" log_file
parse OTU table... done, 29199 entries
parse match list... done
sort lists of matches... done
search for potential parent OTUs... done
merge OTUs... done
update spread values... done
write new OTU table... done, 19897 entries
9302 # accepted in log file
19897 + 9302 = 29199, as expected.
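The same arithmetic can be checked directly from the files. This is a rough sketch, assuming that both OTU tables start with exactly one header line and end with a newline; the function name `check_merge_counts` is hypothetical.

```shell
#!/bin/sh
# Check that: initial OTUs - accepted merges = final OTUs.
# Assumption: both OTU tables start with exactly one header line.
check_merge_counts() {
    input_table="$1"
    log_file="$2"
    output_table="$3"
    # number of OTUs = number of lines minus the header line
    initial=$(( $(wc -l < "${input_table}") - 1 ))
    final=$(( $(wc -l < "${output_table}") - 1 ))
    # grep -c exits with status 1 when it counts zero lines, hence '|| true'
    accepted=$(grep -c 'accepted$' "${log_file}" || true)
    printf 'initial=%d accepted=%d final=%d\n' \
        "${initial}" "${accepted}" "${final}"
    # exit status 0 only if the counts agree
    [ $(( initial - accepted )) -eq "${final}" ]
}
```

With the file names used above, the call would be `check_merge_counts tmp.table log_file tmp3 && echo "counts agree"`.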
Thank you for the information, what you explain about accepted combinations makes complete sense.
I've noticed a minor bug in the log file (column 6: incorrect computation of the query overlap abundance when there is no overlap). It has no effect on the merging results, but you might want to install the new mumu 1.0.1 if you want to use the log file for visualization or exploratory stats.
Sorry for my ignorance, but I'm not able to upgrade the package: the terminal says 'Unable to locate mumu package'. Actually, the mumu --help command works for me, but man mumu does not. Any idea what is going on?
No problem.
To install mumu for the first time (in a terminal, assuming git is installed on your Linux system):
git clone https://github.com/frederic-mahe/mumu.git
cd ./mumu/
make
make check
sudo make install # to install the binary and the manpage
To read the man page:
man mumu
To upgrade mumu:
cd ./mumu/ # go back to your mumu folder
make dist-clean
git pull
make
make check
sudo make install # to install the new binary and the new manpage
Dear @frederic-mahe, while exploring the log file I noticed that it has no column headers, so I do not know what each column refers to. Also, why does the number of "accepted" matches not agree with the number of final ASVs?