Planteome / plant-stress-ontology

An ontology containing biotic and abiotic plant stresses. Part of the Planteome suite of reference ontologies. Formerly called the Ontology of Plant Stress.

List of plant diseases from PHI-base #31

Open jseager7 opened 3 years ago

jseager7 commented 3 years ago

It would really help the Pathogen-Host Interactions database (PHI-base) to have a comprehensive ontology of plant disease, so I'm including a table of all plant diseases from PHI-base: specifically all diseases where the host species is part of the Viridiplantae kingdom (NCBITaxon:33090).

Note that the list has not yet been manually reviewed (the PHI-base team is planning to do this over the next month or so), but I'm posting it here first so you can check how suitable it is.

See below for a link to the table as a tab-separated file (I had to zip it because GitHub won't allow TSV files to be uploaded to issues).
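For anyone who wants to load the table without unzipping it by hand, here is a minimal sketch using only the Python standard library. It assumes the archive contains a single UTF-8 tab-separated file with a header row; the function name `read_tsv_from_zip` is hypothetical, and no column names are assumed.

```python
import csv
import io
import zipfile


def read_tsv_from_zip(zip_source):
    """Return the rows of the first .tsv member of a zip archive as dicts.

    zip_source may be a file path or a binary file-like object.
    Assumption: the TSV is UTF-8 encoded and its first row is a header.
    """
    with zipfile.ZipFile(zip_source) as archive:
        tsv_names = [n for n in archive.namelist() if n.lower().endswith(".tsv")]
        if not tsv_names:
            raise ValueError("no .tsv file found in archive")
        with archive.open(tsv_names[0]) as raw:
            # Wrap the binary stream so csv can read it as text.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text, delimiter="\t"))
```

Calling `read_tsv_from_zip("phibase_plant_diseases.zip")` should then give one dictionary per row, keyed by whatever the header row contains.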

phibase_plant_diseases.zip

Table columns

I've tried to follow the structure of the TSV file that contains the scrape of the APS website, but I've included some extra columns to help with manual review of the data:

Notes

As previously mentioned, to keep the disease names as general as possible, I've removed the host common name or genus from the disease name where it matches the host in the 'host' column, since the information seems redundant in these cases. (It's arguably less redundant when the disease name contains the genus of the host, since that indicates that the pathogen is not specific to the host species listed.)
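To make the rule concrete, here is a rough sketch of how a host name could be stripped from a disease name. It is not the script used to build the table: the function name `strip_host_prefix` is hypothetical, it only handles a host name at the start of the disease name, and the example names in the usage note are illustrative rather than taken from the file.

```python
import re


def strip_host_prefix(disease, host_names):
    """Remove a leading host name from a disease name.

    host_names lists the names to treat as matching the 'host' column,
    e.g. the host's common name and genus. Matching is case-insensitive.
    Assumption: the host name, if present, is the first word(s) of the name.
    """
    for name in host_names:
        pattern = re.compile(rf"^{re.escape(name)}\s+", re.IGNORECASE)
        stripped, count = pattern.subn("", disease, count=1)
        if count:
            return stripped
    # No host name found: leave the disease name untouched.
    return disease
```

With this sketch, `strip_host_prefix("wheat leaf rust", ["wheat", "Triticum"])` returns `"leaf rust"`, while a name such as `"Fusarium ear blight"` is returned unchanged because it does not start with a host name.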

I've kept the pathogen name in the disease name when it seems to be part of the accepted name for the disease (e.g. Fusarium ear blight). This might not be ideal if PSO plans to include the pathogen and host species with the disease name: if the term names were naively generated, PSO would end up with disease names like 'Fusarium graminearum Fusarium ear blight on Triticum aestivum', instead of something less redundant like 'Fusarium graminearum ear blight on Triticum aestivum'.
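The difference between the two naming approaches can be sketched as follows. This is only an illustration of the redundancy problem, not a proposal for how PSO should generate term names; both function names are hypothetical, and the '<pathogen> <disease> on <host>' pattern is inferred from the example above.

```python
def naive_term_name(pathogen, disease, host):
    """Concatenate the parts without checking for repeated words."""
    return f"{pathogen} {disease} on {host}"


def deduplicated_term_name(pathogen, disease, host):
    """Drop the first word of the disease name if it repeats the pathogen genus.

    Assumption: the genus is the first word of the pathogen's binomial name.
    """
    genus = pathogen.split()[0]
    words = disease.split()
    if words and words[0].lower() == genus.lower():
        words = words[1:]
    return f"{pathogen} {' '.join(words)} on {host}"
```

For `("Fusarium graminearum", "Fusarium ear blight", "Triticum aestivum")`, the naive version gives 'Fusarium graminearum Fusarium ear blight on Triticum aestivum', and the deduplicated version gives 'Fusarium graminearum ear blight on Triticum aestivum'.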

For cases where the host name in the disease name doesn't match the host in the 'host' column, I've retained the host name (usually the common name) in parentheses after the disease name. This is to help identify cases where a model host organism has been used instead of the 'natural' host; the model hosts are usually Nicotiana benthamiana or Arabidopsis thaliana.

The inclusion of model hosts could be problematic. I'm not sure whether PSO wants to make a distinction for model hosts, or whether they should be included at all. (While it's true that many pathogens can cause disease on tobacco leaves when inoculated, it may not be notable that they can do so.)

Many of the diseases in the list may be synonymous with each other; the PHI-base team is hoping to identify as many of these cases as possible when we manually review the list.


CC: @ValWood @CuzickA