Arcadia-Science / peptigate

Peptigate ("peptide" + "investigate") predicts bioactive peptides from transcriptome assemblies or sets of proteins.
MIT License

Consider outputting two TSV files instead of one from combine_peptide_annotations.R #16

Closed: taylorreiter closed this issue 6 months ago

taylorreiter commented 6 months ago

As currently written, the left join with peptide_predictions duplicates the information from every left-joined dataframe for any peptide that appears more than once in peptide_predictions. Outputting two TSV files would avoid this: one with the peptide predictions (in which peptide_id is non-unique) and one with the joined metadata dataframes (in which peptide_id is unique). A single file might be easier to work with in practice, though. We should assess this after running the pipeline a few times. If there is relatively little duplication in peptide_id values, separating the two files might not matter.
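
To make the trade-off concrete, here is a minimal sketch of the duplication and of the two-file alternative. It is written in plain Python rather than taken from combine_peptide_annotations.R, and everything except the names peptide_predictions and peptide_id is a hypothetical assumption: the toy rows, the prediction_tool, length and charge columns, and the helper functions.

```python
import csv
import io
from collections import Counter

# Hypothetical toy data: pep1 was predicted twice, so peptide_id is non-unique here.
peptide_predictions = [
    {"peptide_id": "pep1", "prediction_tool": "toolA"},
    {"peptide_id": "pep1", "prediction_tool": "toolB"},
    {"peptide_id": "pep2", "prediction_tool": "toolA"},
]

# Hypothetical per-peptide metadata: exactly one entry per peptide_id.
peptide_metadata = {
    "pep1": {"length": "12", "charge": "2"},
    "pep2": {"length": "30", "charge": "-1"},
}


def left_join(predictions, metadata):
    """Attach metadata to every prediction row, keeping all prediction rows."""
    return [{**row, **metadata.get(row["peptide_id"], {})} for row in predictions]


def to_tsv(rows, fieldnames):
    """Serialize a list of dicts to a TSV string with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Option 1: one combined file. The metadata for pep1 is written twice.
combined = left_join(peptide_predictions, peptide_metadata)
combined_tsv = to_tsv(combined, ["peptide_id", "prediction_tool", "length", "charge"])

# Option 2: two files. Predictions keep repeated ids; metadata has one row per id.
predictions_tsv = to_tsv(peptide_predictions, ["peptide_id", "prediction_tool"])
metadata_rows = [{"peptide_id": pid, **fields} for pid, fields in peptide_metadata.items()]
metadata_tsv = to_tsv(metadata_rows, ["peptide_id", "length", "charge"])

# Share of distinct peptide_ids that occur more than once in the predictions.
id_counts = Counter(row["peptide_id"] for row in peptide_predictions)
duplicated_fraction = sum(1 for n in id_counts.values() if n > 1) / len(id_counts)
```

Computing something like duplicated_fraction on real pipeline output would be one way to judge whether the duplication is large enough to justify splitting the files.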

taylorreiter commented 6 months ago

Done in #23.