abundance table as output

Hi @rjsorr,

For any downstream processing, you don't want the HTML output but Recentrifuge's "extra" output or the pickled (serialized) output. These are the relevant options for you:

  -e OUTPUT_TYPE, --extra OUTPUT_TYPE
                        type of extra output to be generated, and can be one
                        of ['FULL', 'CSV', 'MULTICSV', 'TSV']
  -p, --pickle          pickle (serialize) statistics and data results in
                        pandas DataFrames (format affected by selection of
                        --extra)

With FULL you will get an Excel file with all the information (one spreadsheet for statistics, another for all the data, as explained in the paper, in the manual, and in the wiki). With CSV you will get a single CSV file, and likewise with TSV a single TSV file. With MULTICSV (that is, --extra MULTICSV or just -e MULTICSV in rcf) you will get one CSV file per sample instead.
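If it helps, here is a minimal sketch of reading such a CSV or TSV file downstream with only the Python standard library. It assumes that the first row holds the column names, and the function name read_abundance_table is hypothetical (it is not part of Recentrifuge). The actual columns depend on your rcf run, so inspect the file first.

```python
import csv
from pathlib import Path


def read_abundance_table(path):
    """Read a CSV or TSV file into a list of dicts keyed by column name.

    Assumption: the first row holds the column names. The delimiter is
    chosen from the file extension (.tsv -> tab, anything else -> comma).
    """
    path = Path(path)
    delimiter = "\t" if path.suffix.lower() == ".tsv" else ","
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))
```

You would call it as read_abundance_table("my_output.tsv"), where the file name is a placeholder for whatever rcf wrote in your case. All values come back as strings, so convert counts to numbers before doing arithmetic on them.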

Finally, if you are processing Recentrifuge's results with custom code downstream, you may take advantage of the --pickle flag. With it, rcf will pickle (serialize) both the statistics and the data results as pandas DataFrames in a compressed pickle file. Be aware that the specific format of the DataFrames is affected by other relevant options, such as --extra.
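As a rough sketch of the loading side, pandas can read a compressed pickle directly. This assumes that the file extension reflects the compression (so that pandas can infer it) and makes no assumption about how the statistics and data are laid out inside the file. The helper names load_pickled and describe are hypothetical and not part of Recentrifuge.

```python
import pandas as pd


def load_pickled(path):
    """Load a pickled object with pandas.

    Assumption: the extension (e.g. .gz, .bz2, .xz, .zip) matches the
    compression actually used, so compression="infer" can detect it.
    """
    return pd.read_pickle(path, compression="infer")


def describe(obj):
    """Return a short summary without assuming the object's layout."""
    if isinstance(obj, pd.DataFrame):
        return f"DataFrame with shape {obj.shape}"
    return f"object of type {type(obj).__name__}"
```

Calling describe(load_pickled(...)) on your own file is a quick way to see what you got before writing further code against it. As with any pickle, only load files that you generated yourself or otherwise trust.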

khyox / recentrifuge

abundance table as output #41