pyvideo / pyvideo

A Python media index
https://pyvideo.org
GNU General Public License v3.0

Think on how to ease pyvideo data correction #72

Open Daniel-at-github opened 7 years ago

Daniel-at-github commented 7 years ago

In the pyvideo ETL process we have:

I think the transform part has to be made easier in order to attract new contributors.

Ideas:

description: |
  Speaker: Anthony Scopatz

  Xonsh is general purpose shell that combines Python and the best features of Bash, zsh, and fish. Relying only the standard library and PLY, the xonsh language is a strict superset of Python that compiles to a Python AST. The shell provides exciting features such as a rich history, tab completion from bash and man pages, syntax highlighting, auto-suggestion, foreign-function aliases and more!

  Slides can be found at: https://speakerdeck.com/pycon2016 and https://github.com/PyCon/2016-slides
duration: 1837
id: 5067
language: eng
recorded: '2016-05-31'

related_urls format is ...

related_urls:

willkg commented 7 years ago

When I was running PyVideo, we had people submit things in JSON format and generally, the blocker was bad tooling and not the file format. We did toss around switching to YAML, but didn't end up doing that because it didn't seem like a big enough win and it created different problems. We did add a better editor to Steve, but no one used it as far as I know except @codersquid and me. However, that was when PyVideo was based on Richard and we were mostly using JSON as a serialization format to import data into the db, so there are differences between this iteration and that one.

Even so, I think I'd spend some time finding out what's blocking the people you want to be contributing to pyvideo. Are they eager to hand-edit files and blocked by the file format? Or are they blocked by the lack of better tooling and couldn't care less about the file format?

If anyone is at PyCon US 2017, it's definitely worth doing a survey and talking to people in person there.

My intuition and past experiences suggest tooling will be the bigger win, but it's probably a bigger ongoing project. This is one of the reasons we spent time on Steve as a separate project--so people could fork and experiment and try multiple directions at the same time.

Hope that's helpful!

redapple commented 7 years ago

Was there a discussion on this at PyCon US 2017 in the end? Any minutes, conclusions or decisions? I'd definitely like to have YAML to fix reST descriptions when they fail to render (when I convert from HTML to .md to reST -- is there an HTML-to-reST conversion lib? I wasn't able to use pandoc for this for some reason). When I have to fix the description, what I currently do is:

If only for the literal string blocks using |, I'd switch to YAML for editing, or at least accept both file formats.
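To make the appeal of `|` concrete, here is a minimal sketch of turning one JSON record into YAML with a literal block for the multi-line description. It is only an illustration under assumptions: `record_to_yaml` is a hypothetical helper, not part of pyvideo's tooling, and it only handles flat records with simple keys. It sticks to the standard library and leans on JSON scalars also being valid YAML; a real conversion would more likely use a proper YAML library.

```python
import json


def record_to_yaml(record, indent=2):
    """Render a flat dict as YAML text (hypothetical helper, sketch only).

    Multi-line strings become literal blocks so they can be edited as
    plain text; everything else is written as a JSON scalar, which is
    also valid YAML.
    """
    pad = " " * indent
    lines = []
    for key, value in record.items():
        # Only use a literal block for multi-line strings that do not start
        # with whitespace (that case would need an indentation indicator).
        is_block = (
            isinstance(value, str)
            and "\n" in value
            and value == value.lstrip()
        )
        if not is_block:
            lines.append(f"{key}: {json.dumps(value, ensure_ascii=False)}")
            continue
        # "|" keeps one trailing newline, "|-" keeps none.
        header = "|" if value.endswith("\n") else "|-"
        lines.append(f"{key}: {header}")
        for text_line in value.rstrip("\n").split("\n"):
            # Blank lines stay empty; other lines get the block indentation.
            lines.append(pad + text_line if text_line else "")
    return "\n".join(lines) + "\n"


# A shortened record in the JSON form, with the description on one escaped line.
sample_json = (
    '{"description": "Speaker: Anthony Scopatz\\n\\nSlides can be found at: '
    'https://github.com/PyCon/2016-slides", "duration": 1837, '
    '"language": "eng", "recorded": "2016-05-31"}'
)
print(record_to_yaml(json.loads(sample_json)))
```

In the JSON form the whole description sits on one line with escaped `\n` sequences, whereas the YAML output puts each paragraph on its own line, which is the part that makes hand-fixing reST easier.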

Daniel-at-github commented 7 years ago

Reviewing pull requests is more work than it should be:

YAML eases the pain points of:

Converting to YAML would involve these tasks:

If we want to make a tool instead of converting to YAML, IMHO it should have:

Any ideas for such a tool? What Python technology should it use? Is any functionality superfluous or missing?

zerok commented 7 years ago

If we are moving to YAML we also have to update the search engine 🙂