chanzuckerberg / single-cell-curation

Code and documentation for the curation of cellxgene datasets
MIT License

Add mouse strain #956

Open brianraymor opened 2 months ago

brianraymor commented 2 months ago

Context

See cell-sci-data-wrangling.

References

IMSR (findmice.org) has no REST API, but its search results HTML contains anchor links with consistent ids (textDownload and excelDownload) for the report downloads, which are easy to parse:

<td>
  <a id="textDownload" href="https://www.findmice.org/report.txt?repositories=JAX&results=12736"><img src="https://www.findmice.org/assets/images/text-document-icon.png" title="Download Text Report" /></a>&nbsp;
  <a id="excelDownload" href="https://www.findmice.org/report.xlsx?repositories=JAX&results=12736"><img src="https://www.findmice.org/assets/images/Excel-icon.png" title="Download Excel Report" /></a>&nbsp;
</td>
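
A minimal sketch of how those anchors could be pulled out with only the Python standard library. It assumes the two ids shown above (textDownload, excelDownload) stay stable; the class and function names are hypothetical, and fetching the page itself is left out.

```python
from html.parser import HTMLParser

# Anchor ids taken from the snippet above; assumed to be stable across pages.
DOWNLOAD_IDS = ("textDownload", "excelDownload")


class DownloadLinkParser(HTMLParser):
    """Collect the href of each anchor whose id is a known download id."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        anchor_id = attr_map.get("id")
        href = attr_map.get("href")
        if anchor_id in DOWNLOAD_IDS and href:
            self.links[anchor_id] = href


def find_download_links(html):
    """Return a dict mapping anchor id to download URL."""
    parser = DownloadLinkParser()
    parser.feed(html)
    return parser.links


sample_html = """
<td>
  <a id="textDownload" href="https://www.findmice.org/report.txt?repositories=JAX&results=12736"><img src="https://www.findmice.org/assets/images/text-document-icon.png" title="Download Text Report" /></a>&nbsp;
  <a id="excelDownload" href="https://www.findmice.org/report.xlsx?repositories=JAX&results=12736"><img src="https://www.findmice.org/assets/images/Excel-icon.png" title="Download Excel Report" /></a>&nbsp;
</td>
"""

links = find_download_links(sample_html)
for anchor_id, url in links.items():
    print(anchor_id, url)
```

Matching on the id attribute rather than on the URL pattern keeps the sketch independent of the query string, which changes with each search.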

Example processed output for report:

{'Nomenclature': '+',
 'Strain ID': 'JAX:005004',
 'Strain/Stock': 'ZRDCT Rax<ey1>/ChUmdJ',
 'Repository': 'JAX',
 'State': 'embryo',
 'Synonyms': 'ZRDCT Rax<ey1>/ChUmiJ',
 'Type': 'coisogenic strain',
 'Allele ID': 'MGI:1856862',
 'Allele Symbol': 'Rax<ey1>',
 'Allele Name': 'eyeless 1',
 'Gene ID': 'MGI:109632',
 'Gene Symbol': 'Rax',
 'Gene Name': 'retina and anterior neural fold homeobox',
 'URL': 'https://www.jax.org/strain/005004'}
{'Nomenclature': '+',
 'Strain ID': 'JAX:005004',
 'Strain/Stock': 'ZRDCT Rax<ey1>/ChUmdJ',
 'Repository': 'JAX',
 'State': 'embryo',
 'Synonyms': 'ZRDCT Rax<ey1>/ChUmiJ',
 'Type': 'mutant strain',
 'Allele ID': 'MGI:1856862',
 'Allele Symbol': 'Rax<ey1>',
 'Allele Name': 'eyeless 1',
 'Gene ID': 'MGI:109632',
 'Gene Symbol': 'Rax',
 'Gene Name': 'retina and anterior neural fold homeobox',
 'URL': 'https://www.jax.org/strain/005004'}
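
The two records above share a Strain ID and differ only in Type, so a consumer probably needs to group rows per strain. A rough sketch follows, assuming the text report is tab-delimited with a header row (not confirmed here); parse_report and types_by_strain are hypothetical helper names, and the sample uses only a subset of the columns shown above.

```python
import csv
import io
from collections import defaultdict


def parse_report(text):
    """Parse the report into one dict per row.

    Assumption: tab-delimited text whose first line is the header row.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


def types_by_strain(records):
    """Map each Strain ID to the distinct Type values seen for it, in order."""
    grouped = defaultdict(list)
    for record in records:
        strain_type = record["Type"]
        if strain_type not in grouped[record["Strain ID"]]:
            grouped[record["Strain ID"]].append(strain_type)
    return dict(grouped)


# Sample rows built from the values in the processed output above.
sample_report = "\n".join([
    "\t".join(["Strain ID", "Strain/Stock", "Type"]),
    "\t".join(["JAX:005004", "ZRDCT Rax<ey1>/ChUmdJ", "coisogenic strain"]),
    "\t".join(["JAX:005004", "ZRDCT Rax<ey1>/ChUmdJ", "mutant strain"]),
])

records = parse_report(sample_report)
strain_types = types_by_strain(records)
print(strain_types)
```
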

jahilton commented 2 months ago

Some notes from the scFAIR group...

brianraymor commented 2 months ago

There are Research Resource Identifiers for the mouse strains documented at IMSR.

For example, a JAX strain that references its RRID:

JAX RRID Example
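
If the RRID needs to be derived from the report's Strain ID, something like the following might work. It assumes the RRID is the Strain ID prefixed with "RRID:IMSR_" (for example RRID:IMSR_JAX:005004); that convention should be checked against the strain pages before relying on it, and strain_id_to_rrid is a hypothetical helper.

```python
import re

# Expected shape of a report Strain ID, e.g. "JAX:005004".
STRAIN_ID_PATTERN = re.compile(r"^(?P<repository>[A-Za-z]+):(?P<stock>\w+)$")


def strain_id_to_rrid(strain_id):
    """Build an RRID string from a Strain ID.

    Assumption: RRID = "RRID:IMSR_" + "<repository>:<stock number>".
    """
    match = STRAIN_ID_PATTERN.match(strain_id.strip())
    if match is None:
        raise ValueError(f"Unexpected Strain ID format: {strain_id!r}")
    return f"RRID:IMSR_{match['repository']}:{match['stock']}"


print(strain_id_to_rrid("JAX:005004"))
```
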

Samples: