carpentries / amy

A web-based workshop administration application built using Django.
https://amy.carpentries.org
MIT License

Amy Event URL Validation #1572

Closed Talishask closed 4 years ago

Talishask commented 5 years ago

I believe we recently updated the validation rules for the workshop website (URL) in AMY to accept any website format. However, when trying to 'Update from URL' I am getting this error message (see thread).

Here is the AMY event: https://amy.carpentries.org/workshops/event/2019-11-26-Potsdam-Berlin/
New GitHub page: http://swc-bb.gitext.gfz-potsdam.de/pages/2019-11-26-Potsdam-Berlin/
New repo: https://gitext.gfz-potsdam.de/swc-bb
Old GitHub page: https://swc-bb.github.io/2019-11-26-Potsdam-Berlin-R/

Here is our Slack conversation:

Maneesha: AMY will accept any URL in the event's record. However, to use "Update from URL" AMY needs to know the URL of the GitHub repo it is from. If the website URL is in the typical format (username.github.io/YYYY-MM-DD-sitename) then AMY can figure out the GitHub repo. If the URL is in a different format, then AMY can't figure out the GitHub repo and therefore can't update from URL. Let me know if that's clear.

François: I thought (and a quick inspection of AMY's code suggests this) that the metadata about the workshops was read from the meta tags in the HTML version of the website and should therefore be independent of whether the repository is hosted on GitHub or somewhere else. This validation might be unnecessary.

Maneesha: Thanks for that @francois. We should then be able to make this work if they are still using our template, even if the URL isn't in the standard format. If they are not using our template then it may not have the meta tags and therefore can't update from URL (example: https://www.sib.swiss/training/course/2019-07-elixir-sib-dc).

François: Yes, I agree. The checks about whether the metadata exist or not shouldn't depend on the scheme of the URL (because people could create a workshop website that doesn't use our template using GitHub). Could you please open an issue about this so we can address it in the future?
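The "typical format" rule Maneesha describes could be sketched roughly as follows. This is an illustration only, not AMY's actual code: the pattern `WEBSITE_PATTERN` and the function `repo_from_website_url` are hypothetical names, and the regular expression is an assumption based on the `username.github.io/YYYY-MM-DD-sitename` format mentioned above.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern for the "typical" website format:
#   https://<username>.github.io/<YYYY-MM-DD-sitename>/
# This is an assumption for illustration, not the regex AMY uses.
WEBSITE_PATTERN = re.compile(
    r"^https?://(?P<name>[^/.]+)\.github\.io/"
    r"(?P<repo>\d{4}-\d{2}-\d{2}-[^/]+)/?$"
)


def repo_from_website_url(url: str) -> Optional[Tuple[str, str]]:
    """Return (username, repository) if the URL has the typical format, else None."""
    match = WEBSITE_PATTERN.match(url.strip())
    if match is None:
        return None
    return match.group("name"), match.group("repo")
```

With this sketch, the old page `https://swc-bb.github.io/2019-11-26-Potsdam-Berlin-R/` resolves to a username and repository, while the new page hosted on `gitext.gfz-potsdam.de` does not match, which is the situation reported in this issue.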

pbanaszkiewicz commented 5 years ago

Hello @Talishask. This is peculiar behavior. The HTML of the Potsdam/Berlin workshop website has no slug defined (see line 21 in http://swc-bb.gitext.gfz-potsdam.de/pages/2019-11-26-Potsdam-Berlin/). Because of that, we don't find this <meta> entry for slug. We rely on slug being available, because we check that tags are available with this slug.

I think there are two issues here.

1) The workshop website doesn't define slug. We relied on it, so perhaps it should not be empty? This may be an error in workshop-template.
2) AMY relies on slug having some data, but I think it wouldn't hurt us to accept an empty slug <meta> tag on the workshop website.

Anyway, I think I can sort-of easily fix this bug (?) on AMY side.
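A minimal sketch of what accepting an empty slug could look like, assuming the metadata is read from `<meta name="..." content="...">` tags as described in this thread. The names `MetaTagParser` and `read_workshop_metadata` are hypothetical and this is not AMY's implementation; it only uses Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect name/content pairs from <meta> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here by default.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            self.meta[name] = (attributes.get("content") or "").strip()


def read_workshop_metadata(html: str) -> dict:
    """Return only the tags that have a non-empty value.

    An empty slug is treated as "not provided" instead of as an error.
    """
    parser = MetaTagParser()
    parser.feed(html)
    return {name: value for name, value in parser.meta.items() if value}
```

Under this approach a page with `<meta name="slug" content="" />` still yields its dates, country, and venue; the slug is simply absent from the result.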

fmichonneau commented 5 years ago

We'll have to think a little bit more about this, I think. Depending on how critical it is to have the slug in the HTML metadata, we might have to require instructors to use GitHub. The workshop-template relies on the GitHub API (via the github-metadata Jekyll plugin) to fill in the value of slug in the HTML. For this particular website, because they used GitLab, this information wasn't available.

pbanaszkiewicz commented 5 years ago

I leave it up to you, but from my point of view, adjusting AMY to accept [any/majority/most] of the expected tags from workshop website source is easier than rethinking workshop-template's internal design. The minor downside is that with "update from workshop URL" you won't get slug updated when it's not there.

maneesha commented 4 years ago

@pbanaszkiewicz The slug is a required field, correct? So there should never be a case where the slug is not in AMY.

pbanaszkiewicz commented 4 years ago

@maneesha yes, slug is mandatory. What I meant about "minor downside" is that you can use "update from workshop URL" to, for example, update workshop 2020-xxxx with workshop 2020-zzzz data by pointing to its website, but since slug <meta> tag won't be required, you won't change 2020-xxxx slug to 2020-zzzz.
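To illustrate the "minor downside": a sketch of an update step that overwrites a field only when the website provides a value for it. The function `merge_update` and the dictionary-based event record are assumptions made for this example, not AMY's data model.

```python
def merge_update(current: dict, fetched: dict) -> dict:
    """Return a copy of `current` updated with the non-empty values in `fetched`.

    Fields that are missing or empty on the website (such as an empty slug)
    keep the value already stored for the event.
    """
    updated = dict(current)
    for field, value in fetched.items():
        if value:
            updated[field] = value
    return updated


# Example: the website has no slug, so the stored slug is kept.
event = {"slug": "2020-xxxx", "venue": "Old venue"}
website = {"slug": "", "venue": "NIST", "startdate": "2020-05-29"}
result = merge_update(event, website)
```

Here `result` keeps the slug `2020-xxxx` while the venue and start date come from the website, which matches the behavior described above.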

Talishask commented 4 years ago

Revisiting this for these workshop submissions:

The information from this workshop website won't autofill our AMY event. Is this caused by the same hosting conflict as noted above?

pbanaszkiewicz commented 4 years ago

@Talishask: yes, in both cases they don't have slug filled out in the HTML source:

<!doctype html>
<html lang="en">
  <head>

    <meta name="slug" content="" />
    <meta name="startdate" content="2020-05-29" />
    <meta name="enddate" content="2020-06-01" />
    <meta name="humandate" content="May 29 (Portrait Room) / June 1 (Lecture Room D), 2020" />
    <meta name="country" content="us" />
    <meta name="venue" content="NIST" />

and

<!doctype html>
<html lang="en">
  <head>

    <meta name="slug" content="" />
    <meta name="startdate" content="2020-03-16" />
    <meta name="enddate" content="2020-03-17" />
    <meta name="humandate" content="Mar 16-17, 2020" />
    <meta name="country" content="us" />
    <meta name="venue" content="Tennessee Tech University" />