clearbold / artx


YAML Dump Details #1

Closed atosca closed 10 years ago

atosca commented 10 years ago

Mark,

I refactored this code over the weekend. A few issues I'm still working on:

Caching - implemented the Python requests-cache library. It's creating a SQLite cache, though I don't know whether requests are being served from there first. Need to look into it a little more. Also, for the 10-second refresh: how will this script be run?

Date Format - for pulling the date out of the "dates" field and formatting as MM-DD-YYYY for the head of the filename. Most, if not all, of the events look like exhibitions with a date range. The format of the date range field varies from "MAR5-OCT8" to "on view March 5, 2014 to October 8, 2014." Still trying to figure out how to catch every case and assuming a regex is the way to go. Let me know if you have suggestions.

YAML format - I'm using the PyYAML library, and the output contains different formatting and more markup than the Karen Clark posts. Does this matter to Jekyll? Or would it make more sense to construct the posts by hand?

heymarkreeves commented 10 years ago

Hi, Angela!

Caching - implemented the Python requests-cache library. It's creating a SQLite cache, though I don't know whether requests are being served from there first. Need to look into it a little more. Also, for the 10-second refresh: how will this script be run?

The script will be run manually from the command line any time the ArtX team tells us there's fresh data. The cache shouldn't incur too much work. I was thinking of setting it up such that it would check the last refresh's timestamp and not hit the live URL if it wasn't more than a half-hour or hour ago, so that as you're developing and testing, you aren't constantly subject to lag. I think that's the only place it comes into play: Where can the cache help you work efficiently?
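
As a rough illustration of the timestamp check described above, here is a minimal sketch using only the standard library. The cache file name, the 30-minute window, and the `fetch` callable are assumptions for illustration, not what the script in this repo actually does.

```python
import json
import os
import time

CACHE_PATH = "artx_cache.json"  # hypothetical cache file name
MAX_AGE_SECONDS = 30 * 60       # half an hour, the lower bound suggested above


def cache_is_fresh(path=CACHE_PATH, max_age=MAX_AGE_SECONDS, now=None):
    """Return True if the cache file exists and is younger than max_age seconds."""
    if not os.path.exists(path):
        return False
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) < max_age


def load_events(fetch, path=CACHE_PATH, max_age=MAX_AGE_SECONDS):
    """Return cached data when it is fresh; otherwise call fetch() and rewrite the cache."""
    if cache_is_fresh(path, max_age):
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    data = fetch()  # e.g. a function that requests the live URL and decodes the JSON
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(data, fh)
    return data
```

With requests-cache itself, the same effect should be available through the `expire_after` argument to `install_cache`, and, to my knowledge, responses carry a `from_cache` attribute that shows whether a given request was served from the cache.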

Date Format - for pulling the date out of the "dates" field and formatting as MM-DD-YYYY for the head of the filename. Most, if not all, of the events look like exhibitions with a date range. The format of the date range field varies from "MAR5-OCT8" to "on view March 5, 2014 to October 8, 2014." Still trying to figure out how to catch every case and assuming a regex is the way to go. Let me know if you have suggestions.

Let me pose this one back to them.
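
In the meantime, one possible regex approach for the two formats quoted above might look like the sketch below. It only covers the "MAR5-OCT8" and "on view March 5, 2014 to October 8, 2014" shapes; the function name and the `default_year` fallback (for values that carry no year) are assumptions, and other formats in the feed would need their own handling.

```python
import re
from datetime import date

# Map the first three letters of a month name to its number.
MONTHS = {
    name: number
    for number, name in enumerate(
        ["jan", "feb", "mar", "apr", "may", "jun",
         "jul", "aug", "sep", "oct", "nov", "dec"],
        start=1,
    )
}

# Month name (abbreviated or full), optional space, 1-2 digit day, optional ", YYYY".
DATE_RE = re.compile(r"([A-Za-z]{3,9})\.?\s*(\d{1,2})(?!\d)(?:,?\s*(\d{4}))?")


def parse_date_range(text, default_year):
    """Return every date found in text, in order of appearance.

    default_year is used when the text carries no year (as in "MAR5-OCT8").
    """
    found = []
    for month, day, year in DATE_RE.findall(text):
        number = MONTHS.get(month[:3].lower())
        if number is None:
            continue  # a word followed by a number that is not a month name
        found.append(date(int(year) if year else default_year, number, int(day)))
    return found
```

The first element of the returned list would be the start date, which can then be formatted with `strftime` in whatever order the filename needs.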

YAML format - I'm using the PyYAML library, and the output contains different formatting and more markup than the Karen Clark posts. Does this matter to Jekyll? Or would it make more sense to construct the posts by hand?

Can these be committed and posted so I can take a look? We can clean them out later. As long as they're valid YAML, they shouldn't break anything -- Jekyll's YAML implementation should be able to support markup as well. My question would be: Is there a particular field that has lots of markup? Should that be the content of the post file? The part that appears below the --- in the KCC examples?

Thanks!

heymarkreeves commented 10 years ago

I did have a response from them today: most events are exhibitions, and they will be adding a classification to distinguish exhibitions from events. Start date is still the important one for exhibitions, because we'll be loading those sorted by date on load. In the calendar & map, we'll request live JSON data from them to render on the fly.

heymarkreeves commented 10 years ago

Hi, Angela!

From the ArtX team:

Yes, we will be cleaning and standardizing the date values, but still need to do the cleanup and reformatting. The question of how to model it is also complicated...some events are recurring so simple "start" and "end" dates don't always work. But this may go beyond the scope of the MVP, and for now it is most likely that there will be a standardized date format that we'll clean on the server end, and will push a start_date and end_date up to you, assuming that this is all you'd need for the UI. I'll look to add these fields to the JSON endpoints tonight, though I wouldn't rely on the data that they present yet. Hope this helps!

For now, let's generate the post files with the same date ("now()"), and we can sort them alphabetically until we have a spec on those start & end times?
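
A small sketch of what that interim naming could look like. Jekyll's documented post filename convention is YEAR-MONTH-DAY-title.MARKUP, so the sketch uses that order; the `slugify` and `post_filename` helpers and the `.md` extension are assumptions for illustration.

```python
import re
from datetime import date


def slugify(title):
    """Lowercase the title and collapse runs of non-alphanumerics into single dashes."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def post_filename(title, when=None):
    """Build a Jekyll-style post filename, defaulting to today's date."""
    when = when or date.today()
    return f"{when:%Y-%m-%d}-{slugify(title)}.md"
```

Because every file shares the same date prefix, a plain sort of the filenames orders the posts alphabetically by slug.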

Thanks!

atosca commented 10 years ago

That's great news about the dates! I wish I'd thought to ask before spending too much time on them. But the work I did should make extracting the well-formatted dates a snap.

I committed example YAML files. Let me know how you think the format can be improved. There is a "description" field that could be used as the body of the post. (I also notice that there are some extra dashes in some of the names that I will take care of, and that I need to include that "type" field).
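
For comparison with the PyYAML output, here is a minimal sketch of building a post by hand, with "description" as the body below the second `---`. It relies on the fact that a JSON-quoted string is also a valid YAML double-quoted scalar; the "title" key in the usage example and both function names are assumptions, and it only handles flat string values with simple keys.

```python
import json


def render_post(front_matter, body=""):
    """Render YAML front matter between '---' lines, followed by the post body."""
    lines = ["---"]
    for key, value in front_matter.items():
        # json.dumps gives a double-quoted, escaped string that YAML parsers also accept.
        lines.append(f"{key}: {json.dumps(str(value))}")
    lines.append("---")
    return "\n".join(lines) + "\n\n" + body.strip() + "\n"


def event_to_post(event):
    """Move the 'description' field into the body; keep the other fields as front matter."""
    fields = dict(event)
    body = fields.pop("description", "")
    return render_post(fields, body)
```

For example, `event_to_post({"title": "Spring: A Show", "type": "exhibition", "description": "<p>Hello</p>"})` would put the markup-heavy description in the body and leave two quoted fields in the front matter.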

Locations and events are all lumped in the _posts folder with no distinction in the name. Is that ok, since they will be distinguished in the type field?

Caching is also working. I was thinking that something more complex needed to happen, but you're right - it was just a few minutes' work and it speeds up the script significantly.

heymarkreeves commented 10 years ago

Hi, Angela!

We have this now, too:

Mark-- the JSON endpoints should now have a "start_date", "end_date" and "event_type" field (with either "exhibition" or "event" in the field). While I haven't done significant testing on it, each should be generally working. Let me know if you have any trouble.

Mark
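
A hedged sketch of how the new fields might be consumed once they are reliable. The "start_date" and "event_type" names come from the message above, but the "YYYY-MM-DD" date format is an assumption, since the thread does not say how the dates are encoded.

```python
from datetime import datetime


def parse_iso(value):
    """Parse an assumed 'YYYY-MM-DD' string; return None when missing or malformed."""
    try:
        return datetime.strptime(value, "%Y-%m-%d").date()
    except (TypeError, ValueError):
        return None


def exhibitions_by_start(records):
    """Return the exhibition records that have a usable start_date, earliest first."""
    dated = []
    for record in records:
        if record.get("event_type") != "exhibition":
            continue
        start = parse_iso(record.get("start_date"))
        if start is not None:
            dated.append((start, record))
    dated.sort(key=lambda pair: pair[0])
    return [record for _, record in dated]
```

Records with a missing or unparseable start date are skipped here rather than guessed at, which seems the safer default while the endpoint data is still being cleaned.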