Thanks for making this issue @Foorack.
Important questions are, how many sites do we expect people to enter? Who are the different categories of users and what are their levels of trustworthiness and technical skill? Who is going to officially own and be responsible for this tool?
Anything that allows unrestricted write access to the general internet is a challenge, and anything that allows the general internet to write something and have it automatically appear somewhere else like a status page is even worse. At a minimum the tool needs one free-form text field to submit a domain, and another field to submit optional notes about that domain ("This site behaves differently if you access it from outside the U.S....", that sort of thing.) So, the back-end needs to:
I don't think there should be a status page, but if we do have one, then we also need to put the entries into a hidden queue for a trusted person to approve before they appear on that page.
So most of the work is back-end data processing. I'm not experienced with Google Forms but my guess is it would save us a little front-end work in creating a form but would not help at all on the back-end and might even complicate things.
With the Twitter approach, the problem becomes almost entirely back-end since there is no form.
My feeling is this tool should be a small but complete and standalone web application.
Anything that allows unrestricted write access to the general internet is a challenge, and anything that allows the general internet to write something and have it automatically appear somewhere else like a status page is even worse.
Eh, we do have an unrestricted internet. Fortunately. At least in most places. At least when we exclude net neutrality issu... I'm getting carried away. So what do you mean by "internet"? You should specify this term a bit more. Also, we can't write to the "general internet". I don't know what a "specific internet" would be, but, well... these sentences confuse me.
At a minimum the tool needs one free-form text field to submit a domain, and another field to submit optional notes about that domain ("This site behaves differently if you access it from outside the U.S....", that sort of thing.)
Okay, getting back to sentences I understand. So yes, that's it. Although check-boxes for common issues (mixed content, ...) would be useful. In this case, however, we would have to explain to users what mixed content is. That's quite difficult... However, this could be circumvented by letting HTTPSE check by itself whether there is mixed content, or by creating a temporary ruleset and asking the user again after - say - one week: "Could you use XY.com successfully?" Basically this would be an interactive way to submit domains, usable even by non-tech-savvy users.
make sure the submitted domain is a valid domain in the RFC meaning of "valid"
I'm sure some kind of RegExp will do this. However, we could also solve this issue on the client side (using the "interactive submission" idea again): users can only submit the domain they are currently using. And if the browser can connect to it - and HTTPSE could, e.g., somehow check the IP to make sure it is not an internal LAN IP - they are allowed to submit it.
I think all the other things you mentioned can also be checked on the client side with the interactive submission.
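To sketch what those client-side pre-checks might look like in practice, here is a minimal, purely illustrative example. The regex, function names, and address ranges are assumptions for this sketch, not actual HTTPSE code:

```js
// Illustrative sketch only -- none of this is actual HTTPSE code.
// Rough RFC 1035-style hostname check: dot-separated labels of letters,
// digits, and inner hyphens, ending in an alphabetic TLD.
const HOSTNAME_RE =
  /^(?=.{1,253}$)([a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}$/i;

// Reject loopback/private/link-local IPv4 addresses so LAN hosts are
// never submitted (10/8, 127/8, 169.254/16, 192.168/16, 172.16/12).
function isPrivateIp(ip) {
  return /^(10|127|169\.254|192\.168|172\.(1[6-9]|2\d|3[01]))\./.test(ip);
}

function looksSubmittable(host, resolvedIp) {
  return HOSTNAME_RE.test(host) && !isPrivateIp(resolvedIp);
}
```

Since extensions can read the hostname of the active tab (with the right permission), the user would never have to type anything at all.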
I'm not experienced with Google Forms
I am sure we don't need this. It's a simple UI... I am also sure EFF does not want to "expose" their users to Google. :smile:
My feeling is this tool should be a small but complete and standalone web application.
My feeling is it should be included in the HTTPSE add-on, and maybe it could ask the user when they visit an HTTPS site: "Hey, you have visited example.com more than 100 times. We have not included this in our HTTPS list yet. Do you want to help us by submitting this domain?"
BTW there is already a backend, which does some mixed content testing and so on: https://github.com/EFForg/https-everywhere/issues/1192
Also https://github.com/EFForg/httpse-ruleset-tests could be somehow integrated into HTTPSE. And I thought there was an issue for this task, but I cannot find it right now.
I like the idea of integrating it into the extension and allowing users to report/submit domains.
and ask the user again after - say - one week: "Could you use XY.com successfully?"
Though I'm not that big a fan of HTTPSE breaking websites, especially for non-techy users, as they will just remove the extension completely. Even if we implement it in the extension instead of having a web UI, there is still the need for a backend system collecting and testing the domains. It would be really good if the backend created issues automatically. :)
HTTPSE breaking websites, especially for non-techy users as they will just remove the extension completely.
Yeah, but in my idea the user has to opt in explicitly, and there could even be a "This site is broken!" button or something shown, which reminds the user that something is not working. Also - in an ideal case - HTTPSE could detect mixed content issues by itself and stop the experiment, or at least ask the user whether the site is okay.
Even if we implement it into the extension instead of having a web UI there is still the need for a backend system collecting and testing the domains.
Yeah, at least for the bad guys using bots to submit invalid domains and so on...
It would be really good if the backend created issues automatically.
No, it would be good if it opened PRs automatically. :smiley:
As suggested by @jeremyn in https://github.com/EFForg/https-everywhere/issues/3069#issuecomment-239867644 the bot could also parse the Certificate Transparency list, but in this case the domain needs to be checked on the server side again.
I'm sorry my general/specific internet phrasing was confusing. "General" means anonymous users, "specific" means people with some EFF trust.
I meant that we want to protect EFF from being DOS'd or embarrassed by malicious domain submitters. We need to ask how a hostile government or criminal organization with a large botnet could use this tool to overwhelm EFF's systems or volunteer resources. If it allows people to submit endless variations of dc38cea7-cff6-40d0-85a0-2876e08d9259.com then we need to plan for that. If public users can browse the status of all submitted domains in a list, then we want to prevent users from submitting endless variations of fake "www.eff-sucksssssss.com" domains or fake domains with curse words, slurs and so on. Unfortunately I think some sort of captcha would be required.
Client-side validation is fine in addition to server-side validation but shouldn't replace server-side validation.
We should assume non-malicious users of this tool have no idea how to express what's wrong with the site other than "both http and https work, help!" The optional note field is for the user to tell us anything weird about the site they happen to know about.
I like the idea of putting this tool directly into the add-on and/or allowing people to submit the site they're currently browsing.
My hope for this issue is to let users easily report a site so some technical volunteer can look into it as time permits. Automating creating rulesets, issues or pull requests is a whole extra layer of work and maintenance that I think most people involved would eventually regret doing.
Unfortunately I think some sort of captcha would be required.
That's a good idea. This would also make it harder for people wanting to spam and would certainly discourage bots.
Also the HTTPSE server could require a submission of a current state for seven days. This may be the time the user has to test the domain (& the temporary ruleset) by themselves before the request is even published on the status page/add-on page.
dc38cea7-cff6-40d0-85a0-2876e08d9259.com
A server could just test whether it is pingable / whether curl succeeds. And again: users might only be able to submit a domain when they are currently browsing it. This also means that "www.eff-sucksssssss.com" is impossible unless they really register this domain (in which case it could again be - technically - valid for HTTPSE inclusion).
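As a rough illustration of such a server-side check - hypothetical code, only to show the shape of it; the function name and timeout are made up:

```js
// Hypothetical server-side reachability probe (not EFF code): the
// submission is only accepted if the domain answers an HTTPS request.
const https = require("https");

function checkReachable(domain) {
  return new Promise((resolve) => {
    const req = https.request(
      { host: domain, method: "HEAD", path: "/", timeout: 5000 },
      (res) => resolve(res.statusCode > 0) // any real response counts
    );
    req.on("timeout", () => { req.destroy(); resolve(false); });
    req.on("error", () => resolve(false));
    req.end();
  });
}
```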
Client-side validation is fine in addition to server-side validation but shouldn't replace server-side validation.
Yeah, of course. I just think the client-side "validation" (with the "test period" mentioned) could filter out many broken websites and so on. It only applies to non-malicious users and prevents them from submitting bad entries.
Automating creating rulesets, issues or pull requests is a whole extra layer of work and maintenance that I think most people involved would eventually regret doing.
I think that compared with what we have already proposed here, it is not that difficult. It just needs a bot user interacting with GitHub's API.
Sorry, what does this mean? "Also the HTTPSE server could require a submission of a current state for seven days."
I don't want this to require, encourage, or even allow users to provide a point of contact. People might feel uncomfortable if they think it's not anonymous. They might be nervous about some programmer contacting them with technical questions. They might be nervous about talking with someone in English. They might just prefer not to be bothered. On the other end of it, they might expect personal follow-ups. They might send emails to EFF asking for a list of domains they submitted, like they have some kind of account with EFF. Law enforcement might contact EFF to get the email address of someone who submitted a controversial domain.
I agree that client-side validation to reduce user error and ease work on EFF server-side is fine.
I'm inclined to leave automated issues/pull requests as a possible "phase 2" for this project, after the basic infrastructure of collecting domains and making them visible to volunteers is figured out.
Sorry, what does this mean? "Also the HTTPSE server could require a submission of a current state for seven days."
Okay, imagine this: each day the add-on either submits its full current state (`domain.example: 0 mixed content, 0 times user clicked on "this does not work", 12 sites visited`) or it just sends the domain to the HTTPSE server (`domain.example: test phase day 1 of 7`). The server keeps track of the days and will only allow a final submission once 7 days have passed. Additionally, it could verify whether the data sent each day is plausible; if not, it rejects the request.
Also, the server could require additional tests and inform the client to make the test phase longer, e.g. less than 100 sites visited --> the user needs to test more sites.
The requirement to submit the data each day, to make the data look realistic, and to react to different replies from the server could make it considerably harder to fake these submissions.
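For illustration, the server-side bookkeeping could look something like this toy sketch. All names and thresholds are invented, following the example numbers above:

```js
// Toy sketch, not real HTTPSE code: track each domain's daily reports
// during the test phase and reject implausible data.
const phases = new Map(); // domain -> array of daily reports

function recordDailyReport(domain, report) {
  const reports = phases.get(domain) || [];
  const prev = reports[reports.length - 1];
  // Plausibility check: visit counters should never decrease day over day.
  if (prev && report.sitesVisited < prev.sitesVisited) {
    phases.delete(domain);
    throw new Error("implausible report, test phase rejected");
  }
  reports.push(report);
  phases.set(domain, reports);
  // A final submission is allowed after 7 daily reports, and only if the
  // user exercised the ruleset enough (cf. "less than 100 sites visited
  // --> the user needs to test more sites").
  return reports.length >= 7 && report.sitesVisited >= 100;
}
```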
Of course this whole thing is only imagination. I have not taken into account how it is exactly possible or how difficult it is to implement this. I am just throwing ideas into this issue. :laughing:
As for your contact thing: as explained in the lines above, I think this is a good thing. But of course it should be optional. With all these tests involved, there (hopefully) should be no need to contact the author of the domain submission, so it can be anonymous for those who want to stay anonymous. In this case, any users afraid of getting technical questions can just not enter their GitHub username. (They would not even need to be registered on GitHub, so that's fine.)
On the other end of it, they might expect personal follow-ups. They might send emails to EFF asking for a list of domains they submitted
In my example an optional GitHub username submission would make this obsolete.
Law enforcement might contact EFF to get the email address of someone who submitted a controversial domain.
This would be obsolete too as all information is public anyway. It's on GitHub. (They might contact GitHub to get IPs, but this does not matter for HTTPSE here.)
I'm inclined to leave automated issues/pull requests as a possible "phase 2" for this project, after the basic infrastructure of collecting domains and making them visible to volunteers is figured out.
Structure it as you want. Split the project as you want. Implement it or don't. As I said, I am just submitting some ideas here, which would be great both from a user and a developer (= here: ruleset maintainer) perspective. I'd just say one thing: if this existed already, I think it would be an awesome thing.
I'm just throwing out ideas too. I don't know if EFF even likes this idea or would sponsor it. I may be willing to do some coding for it if the EFF says they will use it and can provide specifications.
Imagine your least technical friend. This person notices that sometimes there's a lock when they visit their (e.g.) hometown bank's website and sometimes there isn't. Their genius friend @rugk put this "S" thing into their internet program that's supposed to make sure the lock is always there. In my view, ideally this user can somehow submit their bank's website for review through this tool we're discussing without being intimidated by the entry form. If it mentions GitHub credentials or there's any hint that some stranger from the internet might contact them, they won't do it.
Also, allowing for the possibility of unexpected contact from EFF opens the door to phishing attacks, for example "You recently submitted $POPULAR_BANK.com to the EFF for review. We're having trouble with it. Please provide your account id and password." etc.
Yet another danger is maliciously submitting a website in someone else's name: I submit $DISGUSTING_SITE.com under your name and then you get an email asking about it. Basically, if we take email addresses then we need to verify that the person submitting one actually controls it.
Basically, I don't think we should accept user contact info. In fact we probably shouldn't accept anything but the domain field, not even optional notes. Maybe we only let them submit the domain they are currently browsing with no typing involved.
I really don't like the idea of collecting browsing history in the add-on for any reason. I think we can verify the domain is legitimate by a few server-side automated requests and maybe a DNS lookup, in addition to any client-side regex-type checking we think is worth doing.
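A DNS lookup of that kind is nearly a one-liner in Node; this is purely illustrative (the function name is made up):

```js
// Hypothetical: confirm the submitted name actually resolves in public DNS.
const dns = require("dns").promises;

async function resolvesPublicly(domain) {
  try {
    const addrs = await dns.resolve4(domain);
    return addrs.length > 0;
  } catch (err) {
    return false; // NXDOMAIN, timeout, etc.
  }
}
```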
I really don't like the idea of collecting browsing history in the add-on for any reason.
It is not collected. It is only done locally of course.
Our problem is not currently lack of coverage (although clearly the more coverage the better), but rather scalability. Unfortunately, the desire for better coverage is at odds with the scalability issue.
The number of domains that have deployed HTTPS has increased, and as a result we're seeing a greater number of ruleset PRs. These PRs generally need a human to review them. We can generally assume that for trivial rulesets (ones for which the rule is just to redirect from `http` to `https`) there is no malicious intent. Even PRs without malicious intent, though, have to be spot-checked by ruleset maintainers to make sure there is no subtle way that functionality is lost. So there's a bottleneck between the point of PR submission and inclusion.
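For context, a trivial ruleset in this sense is only a few lines of XML, roughly like this (the hosts are placeholders):

```xml
<!-- A trivial ruleset: plain http->https redirect, nothing site-specific. -->
<ruleset name="Example">
  <target host="example.com" />
  <target host="www.example.com" />
  <rule from="^http:" to="https:" />
</ruleset>
```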
There is also the problem of scalability in terms of memory consumption and download size. We did some memory profiling and saw that a long-running HTTPSE instance on Chrome takes up about 53 MB, and the extension download is 2.7 MB for Firefox and 1.7 MB for Chrome. These numbers will also get higher as more rulesets come rolling in. This probably won't be a huge issue unless we have some automated submission & inclusion system, but we do want to keep both the memory profile and download size low.
So in short: too many domains would also not be good for HTTPSE.
Hmm, maybe HTTPS by default is getting closer. :smiley:
Regarding scalability, one could combine it with an "Alexa ranking test" so we don't end up securing every private blog. That would keep memory low.
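Such an Alexa cutoff could be as simple as the sketch below. This assumes the classic `rank,domain` CSV layout of Alexa's `top-1m.csv`; the names and the cutoff value are invented for illustration:

```js
// Hypothetical: only auto-accept domains inside the Alexa top N.
const fs = require("fs");

function loadAlexaRanks(csvPath) {
  const ranks = new Map();
  for (const line of fs.readFileSync(csvPath, "utf8").split("\n")) {
    const [rank, domain] = line.trim().split(",");
    if (domain) ranks.set(domain, Number(rank)); // e.g. "1,google.com"
  }
  return ranks;
}

const ranks = loadAlexaRanks("top-1m.csv");
const withinCutoff = (domain, cutoff = 100000) =>
  (ranks.get(domain) || Infinity) <= cutoff;
```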
Maybe not entirely automated, and with a small entry barrier so it only reaches the slightly more advanced people: one could do a default-off #2718, with no automated submission. We wouldn't be overwhelmed with "hey, check this", but people who are interested and notice something could still help. If a site really is broken and they care - my missing $ on heise.de, for example - people will come over here. I do not think we need automated complaint management ;)
More important, I would say, is growing the test bench here: find rules that can be purged or are broken, or ones that are disabled and could be deleted or reinstated because the cert is valid again, and so on. Maintenance is difficult already.
I've been doing ruleset maintenance for a little while now. My updated opinion on this issue is that I really want at least semi-automated ruleset creation and testing tools before we open the gates up to a flood of new requests. Pull request #6857 is a good example of the sort of effort potentially needed to properly handle even a simple "please add this school" issue. At the moment even the `Good Volunteer Task`-labeled issues aren't getting done, let alone the hundreds of other issues. I'm not particularly eager to invite thousands of additional requests through Twitter etc.
Hey! So as discussed over email with @Hainish I'm planning to work on a tool to help with this. Here's a rough MVP proposal:
Once that MVP is done (which I really don't think will take very long - a couple full work days, maybe a week, including time to write the test suite, etc.) I can imagine some additional features:
- Semiautomated ruleset testing/`<test>` tag creation based on crawling the site

I prefer to work in Node.js/Express and test things with Perjury, which is a better implementation of Vows, but if people really seriously hate either of those things I could do something else.
I'm not sure what the proposed web application is for. This issue was to discuss making it easier for users to submit domains that need a ruleset, essentially solving the problem that the catch-all issue #3069 does. But
- Builds a prioritized queue of ruleset submissions to review from open PRs and ordered based on Alexa rankings
- Basically all the UI does is present "this is the next ruleset you should review" based on the queue ordering/prioritization
sounds like the proposed web app is supposed to help me, as a reviewer, find pull requests to review. Is that right? If so, I don't want it, thanks anyway. I can use GitHub to find stuff to work on, if I want to.
For

- Semiautomated ruleset testing/`<test>` tag creation based on crawling the site

we already have five separate projects working toward automated ruleset creation. I think thoughts on that topic should go in that issue.
Looking through this proposed web application again, it seems like this differs from what @strugee and I discussed in the way that @jeremyn points out.
The intention of this webapp is to automate prioritization of requested rulesets, not ones that already have a PR and have been submitted.
This means that users (perhaps logged in through github, as you have in your post-MVP) should be able to submit new sites they wish to have coverage for. The application would automatically prioritize these. If a site already has coverage and needs additional work or needs to be reviewed because it's become stale, we should incorporate that into the logic, too.
Sorry for the confusion!
@strugee as far as the language you wish to implement it in, that's up to you. Node is a perfectly good choice, and many of the helper applications in `utils/` are written in Node. Eventually it would be nice to port all the Python code over to Node as well, so we can have a codebase that requires knowledge of only one language, which will help with long-term maintainability.
@Hainish Who is this prioritization for, meaning, who is consuming this prioritization information? People who want to work on high-priority rulesets? With the existing tags, they can just make bookmarks like this:
@jeremyn there's absolutely no standardization in the way people submit domains they wish to see coverage for. One person opens an issue: "Please add coverage for example.com" and another person's issue says "Coverage for Example domain." This is a problem: it means we can't write a script to auto-label ruleset requests, as we have for PRs. The only reason it works for PRs is because we have code we can parse to determine prioritization. So just using the existing tags in GitHub isn't enough; we need an automation system where users solicit coverage in a standardized way. You yourself have complained in the past about having a separate issue for every coverage request. It gets unwieldy quickly.
One idea comes to mind: having a separate repository for rulesets is a good long-term goal, and perhaps we can move requests for coverage into such a repo before the actual rulesets are separated out. This will neatly separate core codebase issues from ruleset issues. Another thing that may help is creating an issue template which explains that if you're requesting a new ruleset, you should enter the requested domains in an easily parsable manner (the formatting of which we can explain). How does this sound, @strugee, @jeremyn?
@Hainish I understand better what you're looking for now: a way to auto-prioritize domains that people enter, so people can tell us "Please add coverage for example.com" and then example.com shows up on a list somewhere with its Alexa ranking attached. Now that I understand it, I agree with that goal.
This is really just a survey with one question: what domain do you want to let us know about? We can probably use Google Forms for this: intake on a form, do basic cleanup like removing leading and trailing whitespace, and add it to a Google spreadsheet while joining it to the Alexa data stored in another Google spreadsheet.
I'm not sure whether splitting the rulesets off into another repository is overall good or bad, but since I rarely deal with the non-ruleset code my opinion may not be relevant. I don't think it matters much for the specific problem of streamlining receiving requests from users, though.
All right, just don't use a proprietary service such as Google Forms. You already noticed it is easy to do, and this is an EFF project after all…
A basic unanswered question here is how much time and effort the EFF is willing to put toward maintaining any solution day-to-day. For example if the people the EFF is willing to allocate aren't technical enough to maintain a SQL database, then we have to rule out any solution that involves a SQL database regardless of the technical merits. If they aren't willing to plan for or react to being DDOS'd then we have to rule out self-hosting. If they aren't willing to periodically review and prune user input for offensive or illegal values, then we can't make user input public. Etc.
It's like we collectively are consultants and the EFF is our client. If the EFF can't provide a maintenance budget then there's no point in planning to do the work, other than as an intellectual exercise.
Hmm, I must have misread your email, @Hainish.
Here's a revised MVP flow (which again I don't think will take very long to implement):
Already there we have the ability to take in submission requests in a structured way that splits codebase issues from ruleset issues. And as a bonus, to @jeremyn's point - from an implementation perspective it's completely stateless so there isn't a lot of maintenance burden.
Some possible additional features after the MVP:
Obviously this is closely related to having a separate repo for rulesets; I've put some thoughts on that here: https://github.com/EFForg/https-everywhere/issues/2697#issuecomment-303036469
I'm really leaning towards the third option discussed in that comment.
If we are restricting the ability to submit new domains to just people who have GitHub accounts, then we can accomplish the goal by making the separate ruleset repository and adding an issue template. We don't need an application for that. @Hainish's existing Alexa script can handle Alexa autolabeling.
It would be more useful if anonymous, less-technical people without GitHub access could submit domains, perhaps through a single text input field form on or under https://eff.org/https-everywhere. However that introduces the various concerns I described in my previous comments.
@Hainish, I wonder what you think of https://github.com/EFForg/https-everywhere/issues/6322#issuecomment-303191656? @jeremyn has some good points; the less code we write the better - and I definitely keep falling into the trap of overengineering things in this thread.
Seems like maybe the best way to approach this is to start with an issue template and implement features I've listed above that don't fit into the template with a bot, which would basically just add some additional automatically-determined information to each issue.
We could use the same bot to allow anonymous submission, as Jeremy suggested. Probably with some sanity-checking to make sure the submission isn't spam.
@Hainish ping?
@strugee I'm honestly worried that even with an issue template, people will incorrectly format the issues they submit. For instance, if we have a template like this:
Please enter the hosts you wish to see coverage for in the following format: `Hosts: www.example.com, example.com`
I could easily see this submitted:
`Hosts www.example.com & example.com`
This is because people might not understand that strict formatting is necessary for their issue to be prioritized properly.
There are two ways to handle this.
One is to create the webapp as you originally intended, ensuring the domains are submitted with the correct formatting by enforcing this when the user is actually submitting the request. This then is made into an issue on GitHub. The benefit of this is immediate feedback. The drawbacks are that you have to write more code to provide this interface, and think about user logins and perhaps anonymous submissions.
The second way is to have a bot that just looks over the latest open issues (in a similar manner to how I look over just the latest open PRs in `hsts-prune`) and, if an issue is improperly formatted, adds a comment that states this and closes the issue.
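To make that second option concrete, here is a rough sketch of what such a scanning bot might look like, assuming `@octokit/rest` for the GitHub API and the `Hosts:` template format from above. The regex, names, and comment text are invented for illustration:

```js
// Illustrative sketch only; not the actual labeller code.
const { Octokit } = require("@octokit/rest");

const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });
// Matches the template line "Hosts: www.example.com, example.com".
const HOSTS_RE = /^Hosts:\s*([a-z0-9.-]+(?:\s*,\s*[a-z0-9.-]+)*)\s*$/im;

async function scanIssues(owner, repo) {
  const { data: issues } = await octokit.rest.issues.listForRepo({
    owner, repo, state: "open",
  });
  for (const issue of issues) {
    const match = HOSTS_RE.exec(issue.body || "");
    if (match) {
      // Well-formed: hand the host list to the prioritizer.
      const hosts = match[1].split(/\s*,\s*/);
      console.log(`#${issue.number}:`, hosts);
    } else {
      // Malformatted: explain the problem. (Whether to also close the
      // issue is exactly the point being debated below.)
      await octokit.rest.issues.createComment({
        owner, repo, issue_number: issue.number,
        body: "We couldn't find a `Hosts:` line in the expected format; " +
              "please see the issue template.",
      });
    }
  }
}
```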
I kind of think a submission portal for domains is nicer, because of the immediate feedback it provides, and it also seems way more intuitive from the perspective of a submitter. I'd suggest doing away with user logins and instead having a single new user that we create submit the issues once the details are gathered. Many people (including myself) are not keen on giving a third-party application permission to post with our GitHub account. Without logins it's simpler, and at the end of the workflow we can provide a link to the GitHub issue, so if someone wants, they can log in themselves and provide additional comments or see how the progress on this is going. We'd have to have some spam-prevention mechanism that isn't using third-party includes such as `recaptcha`, but there are privacy-friendly alternatives out there.
To summarize, my preferred app looks like the following:
domain
(which is regex-checked for formatting errors), relationship to domain
which could be a drop-down like owner
, webmaster
, other
, and perhaps something like notes
How does this all sound?
I think it would be worth making the new repository, adding the issue template, and running the autolabeler for a while to see what happens, before anyone makes a web app. After some time we can estimate how much reviewer time was spent labeling poorly written issues. My guess is that a web app will cost 50-100 hours of dev/maintenance time over the next year to save about 15 minutes a month of manual work.
@jeremyn the problem there is that you don't know what submitters you've lost due to lack of easy/anonymous access (e.g. no GitHub account).
If it's coded as a regular issue-scanning script rather than a webapp, a lot of the code from the `hsts-labeller` can be reused. For instance, the scanning part, and also the labelling part. The only part that would have to be coded is the format-parsing, commenting, and closing if it's of a non-matching format.
@jeremyn you've convinced me, I think it's fine to code as an issue-scanning app. This also wouldn't require standing up a web server, which requires greater resources and more deployment time from our internal TechOps team at EFF.
@strugee do you have a clear idea of how this might be implemented, given the discussion?
Please don't autoclose misformatted issues. Some projects do that and it is so annoying. We don't have the volume to justify it.
People do make throwaway GitHub accounts to anonymously contact us. You could also gauge interest in anonymous submission by setting up a special @eff.org email address where you take domain requests, and encourage people to send the requests from an anonymous account. You can then manually create issues for those domains. If there's big interest then that argues in favor of a web app or some other heavy duty approach.
@jeremyn the problem I see is that if a malformatted issue is not autoclosed, it will linger in the repo issues without ever getting labelled. We could, for instance, label all malformatted issues as `malformatted`, but this causes extra work for maintainers (either to close that issue or open a new one where the formatting is correct).
I think it's appropriate to close with a polite comment, especially if we have well defined guidelines in the issue template.
Autoclosing an issue with a bot is like the harshest thing you can do to an issue, regardless of how nice the form comment the bot adds is. It is just not worth doing that here.
Ansible is a good example of a repository that uses a bot (@ansibot). Keep in mind the Ansible repository probably gets ten times more traffic than we do and their issues are much more complex. Here are closed issues where ansibot has commented. In https://github.com/ansible/ansible/issues/24982#issuecomment-303737725 you can see ansibot adding a comment asking for more info, but it does not close the issue. When an issue needs to be closed, a person gets involved; for example, see https://github.com/ansible/ansible/issues/24956#issuecomment-303555936.
By the way, the ansibot code is publicly available here.
If this is in a separate repo which has formatting guidelines on opening issues, I see absolutely no problem with closing issues which don't follow these guidelines. This avoids cluttering our issue queue with unparsable issues which will never be addressed.
Unparseable doesn't mean unreadable to a human. Just give it a `needs_info` tag like ansibot does and move on. You can filter these issues with `-label:needs_info` when searching, if you want.
To be clear, both of our positions are defensible. The thing is that, in my opinion, by far the most serious need HTTPS Everywhere has is getting more contributors, and especially more reviewers. So when a choice has to be made, I prefer approaches that are friendlier to new contributors, in the hopes they'll stick around. Some projects are flooded with contributors and bad issues, and for those projects reducing the noise is more important than attracting new contributors. I don't think that's HTTPS Everywhere, though.
Hey, sorry I haven't replied to this yet. I have a pretty good idea of how this should look.
Re: closing, let's punt on it for now. It's super easy to change later anyway. My guess is that if we add a note in the issue template explicitly saying, "don't change the formatting, this is read by a bot," people will do a better job. We can also improve the bot so it gets better at dealing with common formatting problems. I can also write in functionality where you can ping the bot to reparse an issue once formatting is fixed.
So, I think the next action item is to create a repo, right? @Hainish could you make that happen, and give me admin (or just write) access to it? Also, do we want the bot to have a separate repo? I vote yes; I think one of the biggest benefits to splitting out the rules is that you can clone the repo and get just the rules with no code. (Unless we wanted the bot to sit in this repo.)
@strugee You can get started developing the bot code in a separate repository that you own. When it's ready or at least near completion, the code can be transferred to the EFF if they think that's appropriate.
For development you might want to automate creating a small issue repository with each issue as a single test case. Also, test mocks for the GitHub API seem to be a well-traveled path, though I can't give personal recommendations. In any case, running the bot against an EFF-owned repository should come at the end of development, and the bot should probably be owned by either @Hainish, or @Hainish can create a new bot-only user and run the bot under that.
I was going to respond basically with what @jeremyn said exactly. When it's ready to be handed off, we'll simply have to create a new GitHub API key, repo, and issue template for that repo.
@strugee one thing I've been doing recently for these standalone tools is putting them in the `utils/` folder within HTTPS Everywhere. This keeps all the associated utilities for the project in one place, which I consider a bit neater. Also, I've been creating `Dockerfile`s for easy deployment. If you're familiar with docker and are so inclined, by all means follow suit. Otherwise I'll probably just dockerize it after you hand it off.
As far as deployment goes, I'll most likely just stand this up on the same server as we host the labeller.
In any case running the bot against an EFF-owned repository should come at the end of the development
Oh yeah, that was always the plan. Wouldn't want to spam the tracker :)
@strugee one thing I've been doing recently for these standalone tools is putting them in the utils/ folder within HTTPS Everywhere. This keeps all the associated utilities for the project in one place, which I consider a bit neater.
Cool. So I'll develop this in `utils/` in a branch and send a PR then?
Also, I've been creating Dockerfiles for easy deployment. If you're familiar with docker and are so inclined, by all means follow suit.
I'm not but I've been meaning to learn. So I might try my hand at writing a Dockerfile anyway.
Ping @Hainish
@strugee yes, develop in `utils/` and send a PR, please.
Just as an FYI, I'm still actively working on this :)
Taking a while because of a bunch of mocks and stuff that need to be written for the tests. But after all that's done it should be really easy to develop and test without setting up GitHub and stuff.
Thanks @strugee
@Hainish Can we auto-close the issues if they are not corrected for a week after they were first posted?
I'm in favor of this approach, but @jeremyn had strong opinions against auto-closing so I'll let him argue the point.
I've already argued that position in this issue.
Credit for this idea goes to @jeremyn.
The idea was originally posted in https://github.com/EFForg/https-everywhere/issues/6307. Moving it here to keep the other issue clean from off-topic discussion.
The discussion left off with me saying: