JRaviLab / MolEvolvR

An R Package for characterizing proteins using molecular evolution and phylogeny
https://jravilab.github.io/MolEvolvR/

ESKAPE case studies #82

Cateline closed 1 month ago

Cateline commented 1 month ago

Description

What kind of change(s) are included?

Checklist

Please ensure that all boxes are checked before indicating that this pull request is ready for review.

jananiravi commented 1 month ago

Assigned this to @AbhirupaGhosh (primary) and @epbrenner @the-mayer (secondary).

Abhirupa/Evan/David, in addition to reviewing the script, could you also check whether this is the file format we want to use? Thanks!

AbhirupaGhosh commented 1 month ago

Refer to this comment for guidance.

Title: Process CARD Data, Map Short Names, and Run MolEvolvR

  • Download CARD Data: Retrieve the latest CARD dataset. (DOWNLOAD)
  • Open ARO_index.tsv: Parse the file (in R).
  • Map CARD Short Name: Map the CARD Short Name column to shortname_antibiotics.tsv and shortname_pathogens.tsv. The CARD Short Name values follow the format pathogen_gene or pathogen_gene_drug.
  • Sort and Group: Sort and group the data by pathogen and antibiotic.
  • Filter: Select the bug-drug pairs or bugs (pathogens) of interest for further analysis.
  • Download FASTA Sequences: Fetch FASTA sequences for the filtered protein accessions (using Entrez).
  • Run MolEvolvR: Run the protein sequences through the MolEvolvR tool for evolutionary analysis.
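
The mapping, grouping, and filtering steps above can be sketched as follows. This is only a minimal illustration of the logic, written in Python rather than the R the thread asks for, and it is not the script under review. The function names, the lookup entries, and the accessions are all hypothetical placeholders; in practice the lookups would be read from shortname_pathogens.tsv and shortname_antibiotics.tsv, whose column layout is not given here. It also assumes that a trailing token counts as a drug only when it appears in the antibiotic lookup, and that names not starting with a known pathogen code are skipped.

```python
from collections import defaultdict

# Hypothetical lookup tables (placeholders, not the real file contents).
PATHOGENS = {"Mtub": "Mycobacterium tuberculosis", "Ecol": "Escherichia coli"}
ANTIBIOTICS = {"RIF": "rifampicin", "INH": "isoniazid"}


def parse_short_name(short_name, pathogens, antibiotics):
    """Split a short name of the form pathogen_gene or pathogen_gene_drug.

    Returns (pathogen, gene, drug), with drug set to None when absent,
    or None if the name does not start with a known pathogen code.
    """
    parts = short_name.split("_")
    if len(parts) < 2 or parts[0] not in pathogens:
        return None
    if len(parts) >= 3 and parts[-1] in antibiotics:
        gene = "_".join(parts[1:-1])
        drug = antibiotics[parts[-1]]
    else:
        # No recognised drug suffix: everything after the pathogen is the gene.
        gene = "_".join(parts[1:])
        drug = None
    return pathogens[parts[0]], gene, drug


def group_accessions(rows, pathogens, antibiotics):
    """Group protein accessions by (pathogen, drug) pair."""
    groups = defaultdict(list)
    for short_name, accession in rows:
        parsed = parse_short_name(short_name, pathogens, antibiotics)
        if parsed is None:
            continue  # name does not follow the expected pattern
        pathogen, _gene, drug = parsed
        groups[(pathogen, drug)].append(accession)
    return dict(groups)


# Placeholder rows: (CARD Short Name, protein accession).
ROWS = [
    ("Mtub_rpoB_RIF", "ACC0001"),
    ("Ecol_acrA", "ACC0002"),
    ("Mtub_katG_INH", "ACC0003"),
    ("NDM-1", "ACC0004"),  # no pathogen prefix, so it is skipped
]

GROUPS = group_accessions(ROWS, PATHOGENS, ANTIBIOTICS)

# Filter one bug-drug pair of interest for the FASTA download step.
SELECTED = GROUPS.get(("Mycobacterium tuberculosis", "rifampicin"), [])
```

Keying the groups on a (pathogen, drug) tuple keeps the "bug only" case (drug is None) separate from each bug-drug pair, so either kind of selection can be passed on to the Entrez download step.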