LadnerLab / PepSIRF

PepSIRF: Peptide-based Serological Immune Response Framework
GNU General Public License v3.0

Add option for user-defined, tab-delimited ID name map #218

Open jtladner opened 1 year ago

jtladner commented 1 year ago

The "--id_name_map" option currently requires that the user provide a version of the 'rankedlineage.dmp' file downloaded from NCBI. This works well for NCBI-supported taxIDs, but does not support custom taxIDs.

To support custom taxIDs, let's add an option for the user to provide a simple tab-delimited file that links taxIDs to taxon names, as an alternative to the NCBI-formatted file (the format of which is more complex).
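
As a rough illustration of the proposed input, here is a minimal Python sketch of reading such a file. It assumes two tab-separated columns (taxID, then taxon name) and no header line; that layout, the function name `read_id_name_map`, and the example IDs and names are all hypothetical and are not part of PepSIRF.

```python
import io


def read_id_name_map(handle):
    """Read a two-column, tab-delimited taxID -> name map (assumed layout)."""
    id_to_name = {}
    for line_no, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(f"line {line_no}: expected 'taxID<TAB>name'")
        # IDs are kept as strings so custom, non-numeric IDs also work.
        id_to_name[fields[0].strip()] = fields[1].strip()
    return id_to_name


# Made-up example content: one numeric ID and one custom ID.
example = io.StringIO("12345\tExample virus A\ncustom_001\tMy custom taxon\n")
name_map = read_id_name_map(example)
```

Keeping the IDs as strings is a deliberate choice in this sketch, since custom taxIDs may not be integers.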

jtladner commented 3 months ago

Related to this, it appears that there's a bug when the user does NOT provide a 'rankedlineage.dmp' file using the '--id_name_map' flag. In this situation, the "Species Name" column is removed from the header line, but the column still exists for the remaining rows, just with blanks.

Here is an example of the output when this flag is not provided: 399204_PM1_pA-pG_B~399204_PM1_pA-pG_A_enriched.txt

And here is the same version when this flag is provided: 399204_PM1_pA-pG_B~399204_PM1_pA-pG_A_enriched.txt
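
One way to confirm the header/row mismatch described above is to compare the number of tab-separated fields in each row against the header. The Python sketch below does that. The helper name `find_column_mismatches` and the column names in the example are invented for illustration and do not reflect the actual layout of the enriched output files.

```python
def find_column_mismatches(lines):
    """Return (line_number, field_count) for rows whose count differs from the header."""
    header_count = None
    mismatches = []
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")  # keep trailing tabs: they mark blank fields
        if not line:
            continue
        count = len(line.split("\t"))
        if header_count is None:
            header_count = count  # first non-empty line is treated as the header
        elif count != header_count:
            mismatches.append((line_no, count))
    return mismatches


# Hypothetical output: the header has two columns, but each row carries a
# trailing blank third field, as in the behavior reported above.
sample = ["Peptide\tSpecies ID\n", "pep_1\t111\t\n", "pep_2\t222\t\n"]
bad_rows = find_column_mismatches(sample)
```

An empty result would indicate that every row has the same number of fields as the header.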