hotosm / old-export-tool

Version 1 of the Export Tool is suspended - please see V3 https://github.com/hotosm/osm-export-tool

Your own personal one-off download #106

Open harry-wood opened 9 years ago

harry-wood commented 9 years ago

Just realised my main design idea for HOT exports was buried in the closed issue #34. It's a high-level usage observation which influences the design of the whole system, so I'm sorry, I should've fed this into the redesign work much earlier. But here it is...

We've got the idea of a job definition as a static thing which then has multiple runs (over time) stored against it. All of this is permalinkable, which is good news if you're interested in linking to downloads, or linking to a progression of downloads for a particular area. But...

I suspect a lot of people really just want to generate shapefiles as a one-off thing for their own personal one-off download. For this use case then it's wasteful to continue hosting the file at a permalinkable URL, or at all, and wasteful to store the job description. And it feels inflexible. Users often want to make minor tweaks and then run again, but to do that you have to spawn a whole new job description, which feels heavyweight.

A one-off download thing is more like the model Trimble Data Marketplace (formerly "weogeo") have gone for, where you get an email back after your temporary download has been generated, and presumably it's purged from the system shortly after that. But I think we could actually try to design a hybrid approach.

So here goes:

I think it should always start off in the one-off mode, meaning it does the run and you get to see your download(s) via a temporary link: "This link will continue to work for 48 hours". My reckoning is that this will be the end of the interaction for most users. They'll download the file and they're done.

But then at this stage you are also given the option to "publish" the download results. This could be presented as a slightly unusual thing to do, and means we continue to host the files permanently, or for a much longer period.
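A minimal sketch of that lifecycle, assuming a 48 hour lifetime for temporary downloads. The `ExportRun` class, its fields and `purge_expired` are hypothetical names for illustration, not part of the existing export tool:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

# Assumption: temporary downloads live for 48 hours, as proposed above.
TEMP_TTL = timedelta(hours=48)


@dataclass
class ExportRun:
    """Hypothetical record for one generated download."""
    owner: str
    created_at: datetime
    published: bool = False  # every run starts in one-off (temporary) mode

    def publish(self) -> None:
        """Opt in to long-term hosting of the generated files."""
        self.published = True

    def is_expired(self, now: datetime) -> bool:
        """Temporary runs expire after TEMP_TTL; published runs never do."""
        return not self.published and now - self.created_at >= TEMP_TTL


def purge_expired(runs: List[ExportRun], now: datetime) -> List[ExportRun]:
    """Return only the runs whose files should still be hosted."""
    return [run for run in runs if not run.is_expired(now)]
```

A periodic cleanup task could call `purge_expired` and delete the files of any run it drops, so only explicitly published runs keep using disk space.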

Job descriptions would also be stored, and (for "published" ones) shown on a list somewhere, so at this point we might ask the user to name the job (and no need to ask them to do that before. They're just designing their own personal download)

But job descriptions are not a big disk space challenge, so we could store these for the temporary downloads too, making them casually available a bit like "recent documents" in a desktop application. A key thing though is to avoid cluttering up a public list of jobs with people's experimental temporary ones. As an individual user you'd have your own private list of the ten most 'recent temporary jobs', any of which you can retrieve and re-open to modify or re-run. Or a simpler design might be to say only your single most recent temporary job is stored. It could be re-opened automatically as the defaults when you go to start filling in the form, so you're always just making modifications to your previous job design. As an individual user, as well as having 'recent temporary jobs' you'd still want your 'my jobs' page, which is a filter for the jobs you have publicly published (for many users this might be empty).
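One way to picture the private 'recent temporary jobs' list is a small per-user buffer capped at ten entries, kept separate from the public list. This is only a sketch: `JobStore`, its methods, and the dictionary-shaped job descriptions are all hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List

MAX_RECENT = 10  # "ten most recent temporary jobs" per user


class JobStore:
    """Hypothetical in-memory store separating private and public jobs."""

    def __init__(self) -> None:
        self._recent: Dict[str, Deque[dict]] = {}
        self.public_jobs: List[dict] = []

    def save_temporary(self, user: str, description: dict) -> None:
        """Remember a temporary job; the oldest entry drops off past the cap."""
        recent = self._recent.setdefault(user, deque(maxlen=MAX_RECENT))
        recent.appendleft(description)

    def recent_for(self, user: str) -> List[dict]:
        """Private list for one user, newest first."""
        return list(self._recent.get(user, ()))

    def default_form_values(self, user: str) -> dict:
        """Pre-fill the form from the user's most recent job, if any."""
        recent = self.recent_for(user)
        return dict(recent[0]) if recent else {}

    def publish(self, user: str, description: dict, name: str) -> None:
        """Only published jobs get a name and appear in the public list."""
        self.public_jobs.append({**description, "name": name, "owner": user})
```

The simpler "single most recent job" design mentioned above would be the same structure with `MAX_RECENT = 1`.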

The goal is to make the less frequently used features less prominent. Stuff to do with linkable jobs would get out of the way, and only be used by a few people. When they do use it, it'll be easier because the public jobs list will be less cluttered, and the whole system will be less wasteful of disk space (we're purging lots of files after just 48 hours)

Like I say, this influences the design of the whole system so ...yeah maybe one for version 3

bjohare commented 9 years ago

I think the option to 'publish' exports is a good one and could be easily implemented at this stage of the rewrite. As a compromise (given where we are) I'd suggest that published jobs could appear in the public list while unpublished jobs would appear (to the owner) for 48 hours until purged? We could provide a filter to allow users to view their own published and/or temporary jobs and provide expiry info for temporary jobs so they could choose to publish it before it gets removed.
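As a rough sketch of that compromise, the listing could show published jobs to everyone and unpublished jobs only to their owner, together with the time left before purging. The `Job` tuple, `time_left` and `visible_jobs` are hypothetical names, and the 48 hour window is the figure suggested in this thread:

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple, Optional

TEMP_TTL = timedelta(hours=48)  # assumed lifetime of unpublished jobs


class Job(NamedTuple):
    """Hypothetical job record used only for this illustration."""
    name: str
    owner: str
    created_at: datetime
    published: bool


def time_left(job: Job, now: datetime) -> Optional[timedelta]:
    """Time until a temporary job is purged; None for published jobs."""
    if job.published:
        return None
    return max(job.created_at + TEMP_TTL - now, timedelta(0))


def visible_jobs(jobs: List[Job], viewer: str, now: datetime) -> List[Job]:
    """Published jobs are public; temporary ones show only to their owner."""
    visible = []
    for job in jobs:
        if job.published:
            visible.append(job)
        elif job.owner == viewer and time_left(job, now) > timedelta(0):
            visible.append(job)
    return visible
```

The value returned by `time_left` is what the list could display as expiry info, so an owner can decide to publish before the job is removed.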

I'm not sure about allowing users to modify and re-run an existing job (rather than cloning it), as all previous runs for that job will then have a different configuration. I think it's too much at this stage to provide a kind of versioning system for export jobs..? In a later iteration it should be possible to allow the user to create a job and 'save as default configuration' to a user profile, which could then be used for subsequent jobs..?

Also, while we're on the subject.. in the current export tool, 'deleted' jobs are still shown in the export list.. this clutters up the list as well, and I'd be in favour of deleting jobs immediately...I don't see the point in keeping deleted jobs hanging around on the system..

Given where we are in the rewrite I'd also say it's out of scope to allow users to create exports without naming them first.

mataharimhairi commented 9 years ago

I also think that 'publishing' exports for up to 48 hrs publicly is a good option. It allows easy access and use by individuals during activations and training events.

Correct me if I am wrong but it will be a link to the exported data that will be 'published', not that individuals will be re-running an export job.

There definitely has to be a time limit on it to reduce clutter and space. Similarly I don't think that 'deleted' export jobs should be sticking around, as this seems to defeat the purpose.

harry-wood commented 9 years ago

Well the "publish" terminology I'm suggesting here, would be more like... you get to download your files after they've been generated, but they're temporary and they'll be deleted after 48 hours unless you choose to "publish"

mataharimhairi commented 9 years ago

@harry-wood thanks for the clarification. We will indeed be incorporating this option in the redevelopment.