Open rmd13 opened 5 years ago
Hi! I can make one for the Bloomington stock center easily, I think. Let me do that over the weekend!
I just recalled that Bloomington stock center provides a very neat list of their stocks with corresponding genotypes. I think it is better to use the list rather than causing network traffic. Please check out the link below.
Yes, I got the list as a CSV file.
I found that http://flybase.org/ has a quick search option called Data Class. If I select Stock and enter a stock number, it returns a page with the correct item. This works for any stock center, and it is really powerful. I've just been studying your R code and am trying to do this in R, but I am a beginner, so it may take me a long time to finish.
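For what it's worth, looking stock numbers up in a local copy of the list could be sketched roughly like this. The column names `stock_number` and `genotype` and the helper `lookup_genotype` are assumptions made for illustration, not the real layout of the Bloomington file, and the inline text only stands in for the downloaded CSV.

```r
# Stand-in for the downloaded list; the column names "stock_number" and
# "genotype" are assumptions, so adjust them to match the real file.
csv_text <- 'stock_number,genotype
11572,"P{ry[+t7.2]=PZ}frc[02619] ry[506]/TM3, ry[RK] Sb[1] Ser[1]"'

# Read everything as character so stock numbers are compared as text.
stocks <- read.csv(text = csv_text, colClasses = "character",
                   stringsAsFactors = FALSE)

# Hypothetical helper: return the genotype for each ID, NA when not in the list.
lookup_genotype <- function(ids, stock_table) {
  stock_table$genotype[match(as.character(ids), stock_table$stock_number)]
}

genotypes <- lookup_genotype(c("11572", "99999999"), stocks)
```

With the real file, `read.csv("path/to/list.csv", ...)` would replace the inline text, and IDs missing from the list come back as `NA` instead of triggering a web request.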
session <- html_session("http://flybase.org")
form.original <- html_form(session)[[10]]
I finally got it working. The full code is below:
library(rvest)
session <- html_session("http://flybase.org")
form.original <- html_form(session)[10][[1]] # or equivalently [[10]]
# [[10]]: the form we want is dataclass_form
# <form> 'dataclass_form' (POST /search/)
# <input hidden> 'fld': fbxx-?
# <input hidden> 'tab': dataType_tab
# <input hidden> 'caller': quicksearch
# <input hidden> 'species': Dmel
# <button submit> '<unnamed>
# <input radio> 'field': SYM # radio option: symbol/ID?
# <input radio> 'field': ALLTEXT # radio option: all text?
# <select> 'data_class' [0/33] # select Stock
# <input text> 'query': # enter the ID to query
stock_IDsQuery = c("CH321-94A02","7568","6367","24343","28827","150337","2363")
Stock_PlainGenoTypeAss <- rep_len("", length(stock_IDsQuery))
Stock_shortGenotypeAss <- rep_len("", length(stock_IDsQuery))
Stock_IDechoAss <- rep_len("", length(stock_IDsQuery))
for (i in seq_along(stock_IDsQuery)) {
  aStock <- stock_IDsQuery[i]
  form <- set_values(form.original, field = "SYM", data_class = "Stock", query = aStock)
  # Positional index into the returned session object; it yields the raw response body.
  result_raw <- submit_form(session, form)[[6]][[6]]
  result <- rawToChar(result_raw)
  # Match an FBst ID followed by a closing double quote. Finally got it working!
  pattern <- "FBst\\w+\""
  gregout <- gregexpr(pattern, result)
  if (!identical(-1L, gregout[[1]][1])) {
    # Use the first hit only; the -2 drops the trailing quote matched by the pattern.
    aHit1st <- gregout[[1]][1]
    aHit1stLen <- aHit1st + attr(gregout[[1]], "match.length")[1] - 2
    aStock_ID <- substr(result, aHit1st, aHit1stLen)
    aStock_Http <- paste("http://flybase.org/reports/", aStock_ID, sep = "")
    aStock_Html <- read_html(aStock_Http)
    Stock_PlainGenoTypeAss[i] <- aStock_Html %>% html_nodes(".row:nth-child(6) .col-sm-9") %>% html_text()
    # "w[1118]; Dp(3;2)GV-CH321-94A02, PBac{y[+mDint2] w[+mC]=GV-CH321-94A02}VK00037"
    aStock_IDre_ <- aStock_Html %>% html_nodes(".row:nth-child(4) .field_label+ .col-sm-height") %>% html_text()
    Stock_IDechoAss[i] <- aStock_IDre_[[1]]
    # [1] "FBst0550356"
    Stock_shortGenotypeAss[i] <- aStock_Html %>% html_nodes(".row:nth-child(7) .col-sm-9") %>% html_text()
    # "w1118; Dp(3;2)GV-CH321-94A02, PBac{GV-CH321-94A02}VK00037"
  }
}
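The ID-extraction step in the loop above could also be tried out offline, without hitting FlyBase. Below is a minimal sketch using `regexpr` and `regmatches` from base R. The helper name `first_fbst_id` and the HTML fragment are made up for illustration, and the pattern assumes an ID is `FBst` followed by digits (as in FBst0550356), which is narrower than the `\\w+` pattern used in the loop.

```r
# Hypothetical helper: return the first FlyBase stock ID (FBst + digits) found
# in a string, or NA_character_ if there is none.
first_fbst_id <- function(html_text) {
  hits <- regmatches(html_text, regexpr("FBst[0-9]+", html_text))
  if (length(hits) == 0) NA_character_ else hits[[1]]
}

# Made-up HTML fragment for illustration only; real search pages will differ.
sample_html <- '<a href="/reports/FBst0550356">FBst0550356</a>'

id_found <- first_fbst_id(sample_html)
id_missing <- first_fbst_id("no stock ID in this text")
```

Matching only the ID itself means there is no trailing quote to trim afterwards, and the `NA` result makes it easy to skip stocks for which the search returned nothing.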
OMG, this is awesome. Would you mind if I incorporate your method into the original script? I am not very familiar with how GitHub works, so I don't know how to invite you to contribute or edit.
Also, I would like to write code that accesses the FlyBase "Batch Download" tool as well. Please stay tuned.
No problem, you can add it to your code.
Dear hangnoh, is it possible to write a function in R that takes a fly stock number and returns the plain genotype? I have a list of fly stock numbers. The majority are from Bloomington and some are from Tokyo, and I can find information on FlyBase for all of them. For example, inputting stock ID 11572 should return the genotype: P{ry[+t7.2]=PZ}frc[02619] ry[506]/TM3, ry[RK] Sb[1] Ser[1]
Thanks
Originally posted by @rmd06 in https://github.com/hangnoh/flybaseR/issues/1#issuecomment-449247772