ElixirTeSS / TeSS_scrapers

TeSS HTML page scrapers in Ruby looking for training resources and events metadata.

NGS Registry material endpoint needs to be one level up! #93

Open njall opened 8 years ago

njall commented 8 years ago

From the reviewer of the Bioinformatics App note:

Similarly some links lead to a confusing array of GitHub pages. https://tess.elixir-uk.org/materials/introduction-to-ngs#sub-modules is expected to show Introduction to NGS materials but it took a bewildering number of clicks (5 to get to Frederik_Coppens) to get to this material. Again, the user expectation is that it is one click away. How do the scrapers aggregate materials? What appears to be aggregated is the surface level of data and not the actual data, thus reducing the utility of TeSS.

True enough, our endpoint points here: https://microasp.upsc.se/ngs_trainers/Materials/blob/master/Content/HTS-introduction/README.md To make the materials more readily accessible, it should point either here: https://microasp.upsc.se/ngs_trainers/Materials/tree/master/Content/HTS-introduction/Frederik_Coppens or here: https://microasp.upsc.se/ngs_trainers/Materials/tree/master/Content/HTS-introduction

anenadic commented 8 years ago

The NGS registry scraper needs revisiting: try to extract just the short and long descriptions, and avoid embedding the whole page within TeSS together with the relative links etc. that come from the training page itself.

njall commented 8 years ago

Scraper needs to delete README.md from end of URL if present
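
A minimal sketch of that URL clean-up, assuming the scraper holds the endpoint as a plain string. The helper name `strip_readme` is hypothetical and not taken from the TeSS_scrapers code:

```ruby
# Hypothetical helper: drop a trailing "/README.md" from a material URL.
# URLs that do not end in README.md are returned unchanged.
def strip_readme(url)
  # \z anchors at the very end of the string; the /i flag also matches "readme.md".
  url.sub(%r{/README\.md\z}i, '')
end

endpoint = 'https://microasp.upsc.se/ngs_trainers/Materials/blob/master/Content/HTS-introduction/README.md'
puts strip_readme(endpoint)
# prints https://microasp.upsc.se/ngs_trainers/Materials/blob/master/Content/HTS-introduction
```

Note that the result still contains `blob`, whereas the URLs suggested earlier in this thread use `tree`; whether that segment also needs rewriting is not settled here.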