Bohdan-Khomtchouk / Biochat

Natural language processing of Gene Expression Omnibus data
http://www.biochat.ai
MIT License

GDS: "Technology type: in situ oligonucleotide" --> microarray #2

Closed by Bohdan-Khomtchouk 6 years ago

Bohdan-Khomtchouk commented 6 years ago

Most GDS entries list "Technology type: in situ oligonucleotide", which means they are microarray data. We should add this to the metadata field: scrape the Technology type metadata field for all GDS entries and display "in situ oligonucleotide" as "microarray" in the UI.

vseloved commented 6 years ago

@Bohdan-Khomtchouk Can you point to a particular record with a "Technology type" field? I can't find one in either GDS or GSE.

Bohdan-Khomtchouk commented 6 years ago

@vseloved Each GDS entry (e.g., https://www.ncbi.nlm.nih.gov/sites/GDSbrowser?acc=GDS10) has a Platform metadata field with a hyperlinked GPL accession ID (e.g., GPL24). The Technology type metadata field is on that GPL page; in the case of GDS10, it says "in situ oligonucleotide". This means it's a microarray data record.

N.B.: According to the Reference Series metadata field, GDS10 corresponds one-to-one with GSE11, which you can ascertain by looking at the identical publication (i.e., citation info). When you click on GSE11 you see that the Experiment type is "Expression profiling by array". This also means it's a microarray data record.

In summary, both "Expression profiling by array" (for GSE data records) and "in situ oligonucleotide" (for GDS data records) should be coded in Biochat as "microarray".
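
A minimal sketch of that mapping, in Python for illustration only. The function names, dictionary keys, and in-memory record layout are all hypothetical and are not Biochat's actual code or data model; only the two labels and the GDS10/GPL24/GSE11 values come from this thread.

```python
# Labels named in this thread that indicate microarray data (compared lowercased).
MICROARRAY_LABELS = {
    "in situ oligonucleotide",        # GPL "Technology type", reached from a GDS record
    "expression profiling by array",  # GSE "Experiment type"
}


def normalize_technology(label):
    """Return "microarray" for the labels above; otherwise return the cleaned label."""
    if label is None:
        return None
    cleaned = " ".join(label.split())  # collapse stray whitespace from scraping
    if cleaned.lower() in MICROARRAY_LABELS:
        return "microarray"
    return cleaned


def technology_for_record(record, platforms):
    """Resolve the display label for a GSE or GDS record (hypothetical dict layout)."""
    # GSE records carry the experiment type directly.
    if "experiment_type" in record:
        return normalize_technology(record["experiment_type"])
    # GDS records point to a GPL platform, which holds the technology type.
    platform = platforms.get(record.get("platform"), {})
    return normalize_technology(platform.get("technology_type"))


# Example values taken from the records discussed above.
platforms = {"GPL24": {"technology_type": "in situ oligonucleotide"}}
gds10 = {"accession": "GDS10", "platform": "GPL24"}
gse11 = {"accession": "GSE11", "experiment_type": "Expression profiling by array"}

print(technology_for_record(gds10, platforms))  # microarray
print(technology_for_record(gse11, platforms))  # microarray
```

Labels not in the set pass through unchanged, so other technology types would still display as scraped.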