backdrop-contrib / backdrop_upgrade_status

Checks to see if the installed modules on a Drupal 7 site are ready for upgrade to Backdrop CMS.

Add telemetry to Backdrop Upgrade Status #6

Open philsward opened 3 years ago

philsward commented 3 years ago

Forgive me if this is already available, I haven't tried this module out yet.

The idea is to have a link or button (or maybe even an automatic send when the module is enabled) through which the site submits the list of modules it uses on Drupal 7.

Aggregate this information over time to compile a list of the modules whose ports would remove the most friction for the majority of sites looking to migrate.

Instead of guessing, or instead of waiting for people to request it, the dev team can choose to just start working on some of these ports based on the data.

I realize a good majority of modules have either been ported or included in core, but there might be a handful of outliers that are unknown and this would give a good snapshot of real-world usage.

I think I would assign a unique number to each site (for privacy concerns), enable it by default to work in the background and submit weekly via Cron.

indigoxela commented 3 years ago

@philsward collecting data without the knowledge of the website owner doesn't seem like the best idea and will be impossible/illegal in Europe.

I would assign a unique number to each site (for privacy concerns), enable it by default to work in the background and submit weekly via Cron.

😨 I can't describe how creepy that appears to me.

philsward commented 3 years ago

@indigoxela noted. Configure it so the information is sent in a legal, non-creepy way, but it needs to be designed for high participation.

A manual submission is fine. On the backdrop status check page talk about submitting the list to backdrop and why it's important.

Add a setting that is enabled by default for "Show a message to superadmin user whenever modules have changed that asks for a resubmission to backdrop".

On the settings page, explain what is sent, how it is sent, how it is saved etc.

The whole point of this is two-fold:

1) encourage participation from non-backdrop users

2) empower the team with data that can help with decision making and priority

stpaultim commented 3 years ago

I like this idea a lot. The Backdrop CMS community has been very clear in its commitment to being conservative about what data we collect and to doing it in an open and transparent way. This has been discussed extensively within the telemetry initiative.

In theory, nothing in this proposal differs from what we hope to do, in terms of data collection, with Backdrop CMS sites. In this case, though, we would be collecting data (transparently, responsibly, and with permission) about Drupal 7 sites that are considering the move to Backdrop CMS.

If we collected this data from even 20-30 sites it could reveal some interesting gaps in our contrib coverage that might be affecting which people choose to migrate and which do not.

Once we have the telemetry project for Backdrop sites working, I think that adding something similar to this module might be relatively easy. So, for now, progress on the telemetry initiative is a blocker for this idea.

philsward commented 3 years ago

😨 I can't describe how creepy that appears to me.

@indigoxela the unique number idea is to mask the identity of the originating site. Upon install, a unique number is created for that site. When the data is submitted, it submits the module list along with the unique id. The team can see the id, but have no way of knowing which site is tied to that id.

Without the unique id, it makes aggregation a bit more difficult especially if we want re-submissions from the same site later. For example, a person could spam the resubmission link and generate a bunch of false positives. There has to be a way to say: "They voted 20 times, but only one vote counts."
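A minimal sketch of the "only one vote counts" idea, written in Python purely for illustration (the module itself would be PHP on Drupal 7). The function name and sample data are hypothetical. Each site is keyed by its random id, and a resubmission replaces that site's earlier list instead of adding to the totals.

```python
from collections import Counter


def count_modules(submissions):
    """Count how many distinct sites use each module.

    `submissions` is an iterable of (site_id, modules) pairs in the order
    they were received. A later submission from the same site_id replaces
    the earlier one, so resubmitting never adds extra votes.
    """
    latest = {}
    for site_id, modules in submissions:
        latest[site_id] = set(modules)  # keep only the newest list per site

    counts = Counter()
    for modules in latest.values():
        counts.update(modules)
    return counts


# Hypothetical sample data: site "a1" submits three times.
submissions = [
    ("a1", ["views", "webform"]),
    ("a1", ["views", "webform"]),   # same site resubmits the same list
    ("b2", ["views", "rules"]),
    ("a1", ["views", "pathauto"]),  # site a1 changed its modules
]
counts = count_modules(submissions)
```

The id could be generated once at install time with something like `uuid.uuid4()`, which is random and carries no information about the site.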

I'll agree @stpaultim this can wait until the telemetry piece is a bit further along.

Update: looking back, I'm honestly not sure which part was referred to as creepy... I'm now thinking the "enable by default" statement which I addressed in a previous comment.

indigoxela commented 3 years ago

I'm still not sure if the backdrop_upgrade_status module is the right place for telemetry. I wouldn't expect that sort of behavior when installing a little helper tool.

I'm honestly not sure which part was referred to as creepy

Automatically sending data via cron without interaction of the admin. 😉

philsward commented 3 years ago

Items to agree on in order to move forward:

stpaultim commented 3 years ago

I'm still not sure if the backdrop_upgrade_status module is the right place for telemetry. I wouldn't expect that sort of behavior when installing a little helper tool.

In my view, we would only do this if it was done in a completely transparent and open way. I don't think it would happen automatically, I suspect it would be an opt-in feature. When a user is configuring the module, they would be asked to share an anonymous list of the Drupal modules that they are using for the purposes of helping the Backdrop Community port the most requested modules.

It would be a very straightforward and transparent request and in the interest of users to share the data. But, fully optional and opt-in (in my opinion).

I love this idea in theory. My reservations are:

1) How much work is it?

2) Do we have the resources to actually use this data?

3) What is the status of the upgrade module? Do we have buy-in from the maintainer? Are people actually using the upgrade module?

I'm reluctant to spend the resources to collect this data, unless there is some likelihood that we will actually have the time to process and use the data collected in a constructive way. It's not as if we have the ability to assign developers the task of porting the most needed modules. Also, as @philsward has pointed out, this data will only be useful during a time frame of 1-2 years.

I think it's useful to consider whether or not our current telemetry initiative has the capacity to collect this data and store it, in order to keep our options open.

But, upon some reflection, I agree that it might make sense to keep this project completely separate from the current core telemetry initiative. I'm still not sure on the technical considerations.

Note, I've never actually used the upgrade module - so, I'm not sure how adding to this process will actually work. Personally, I don't see myself spending much more time on this issue until after the release of 1.18, because I'm swamped at the moment and while I agree with the intent of this idea, it's not quite at the top of my personal priority list right now (but, I could be convinced that it is worth it).

stpaultim commented 3 years ago

I'm going to write a bit more. It's always good to ask, what is the problem we are trying to solve and is this the best solution (does it provide the best return on our investment)?

The ultimate goal is to make sure that Backdrop CMS has the contrib modules that most D7 site owners are looking for when they evaluate BackdropCMS.

What are other solutions:

1) Doing better outreach and recruitment to potential contrib porters/maintainers.

2) Providing more resources and support for those who are trying to port modules to generally increase the scope of what is available.

3) Asking new folks to the community to fill out a survey that asks them what features and modules that they are looking for?

4) Creating some kind of reward (T-Shirts or Swag) for anyone that ports a module

My point is that, while I like this idea, it's not yet clear to me that it's the best solution for the problem we're trying to address. Now, if we had a volunteer that was eager to write the code and make it happen, that would be less of a consideration.

philsward commented 3 years ago

I love this idea in theory. My reservations are: How much work is it?

The work involved will include two pieces:

1) The collection site (where the data is sent and aggregated)

2) Adding the necessary code to the Backdrop Upgrade Status module

Do we have the resources to actually use this data?

Good question. I think this will answer itself if we can get this idea in front of the right folks who see the vision of what it can accomplish long term.

What is the status of the upgrade module? Do we have buy-in from the maintainer? Are people actually using the upgrade module?

The first two I can't answer, but whether people are actually using it is a great question to ask and while I don't know the actual answer, I am very confident in saying that there aren't many people using it.

The reason is that it's not on Drupal.org, which I am also pushing for. However, before we get it added to Drupal.org, I would like to see the telemetry added to the module so that "from day one" we can begin collecting data on the module usage of Drupal 7 sites.

What are other solutions:

  1. Doing better outreach and recruitment to potential contrib porters/maintainers.

By adding this module to Drupal.org, we are getting directly in front of the very people who we are trying to recruit. By asking them to opt-in to offering their data, we are "from the beginning", asking them to contribute back to the Backdrop community to make it better. They wouldn't install the module if they haven't at least considered Backdrop as an alternative solution. By asking for their data, we are telling them they can help without actually being active yet.

  2. Providing more resources and support for those who are trying to port modules to generally increase the scope of what is available.

Right. If a Drupal 7 user installs BUS on their D7 site and we have a link from the BUS status page to the data that is being aggregated, and they see a list of the top modules that still need ported and some of those are used by that user, it might encourage them to jump in and port them. In other words, let's encourage D7 folks to be more active in the upgrade process of Backdrop by giving them the data they need at their fingertips. If we can show a user that 5,000 other sites will benefit from the port of a module and they themselves are using that module, it might give them more reason to jump in and help port it.

  3. Asking new folks to the community to fill out a survey that asks them what features and modules that they are looking for?

While this is an option, the userbase for Backdrop is very small. 1500 vs 100,000+. We can get A LOT more data from a modest 10% of Drupal 7 sites than we can from the current Backdrop community. (If we can get Backdrop Upgrade Status installed on 10,000 Drupal 7 sites, I would feel very accomplished with this project)

  4. Creating some kind of reward (T-Shirts or Swag) for anyone that ports a module

An option worth exploring

My point being, that while I like this idea. It's not yet clear to me that it's the best solution for the problem we're trying to address.

Let me respond with a question: "Is it easier to convince a Drupal 7 user to use BackdropCMS or a WordPress user to use BackdropCMS?". The solution I'm proposing is to get in front of a very large userbase with a very low barrier to transition.

In the few weeks I've been active in the community, I've observed that the overall community wants more people and more contributors, but struggles to attract them. I'm proposing a very pointed, focused and effective method to do exactly that. It won't happen overnight, but the more people who realize their sites are close to migration, the more reason there is for them to begin building new sites with Backdrop (if they haven't already) and get involved.


What I'm trying to accomplish with this is to be intentional about getting in front of a target audience of people who are currently in limbo on where to go next: D9+? WordPress? Backdrop? Something else?

These folks are in a position to be VERY receptive to Backdrop CMS as a long-term solution, but we can't ever speak to them if we aren't actively in front of them. The goal is to increase Backdrop CMS's outreach through a channel whose audience is extremely receptive to the suggestion that BackdropCMS will be their next tool.

OK, so how do we convince 25,000 people that BackdropCMS is a great "next move" for them? Gain their trust. In order to do that, Backdrop needs to show that:

1) It is stable.

2) It is familiar

3) It has low overhead/barriers to adoption

4) It is around for the long haul

This is another topic for another discussion, but we could use the BUS module to subtly stay in front of D7 users by showing a monthly admin message such as "There were X more modules added to Backdrop CMS since last month and Y of them you use. [link] Check the status of your site [/link]". This is a wishful idea at this point, but it shows how we can be proactive about staying in front of our target audience and show them "Hey, things are moving forward. BackdropCMS isn't going anywhere".
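As a rough illustration of where X and Y in that message would come from, here is a small Python sketch (the real module would be PHP). The function name and the placeholder module names are hypothetical. X is the set of ports that exist now but did not last month, and Y is the part of that set the site actually uses.

```python
def monthly_message(ported_last_month, ported_now, site_modules):
    """Build the monthly status message from three collections of module names."""
    newly_ported = set(ported_now) - set(ported_last_month)  # X
    relevant = newly_ported & set(site_modules)              # Y
    return (
        f"There were {len(newly_ported)} more modules added to Backdrop CMS "
        f"since last month and {len(relevant)} of them you use."
    )


# Placeholder names only; these are not real port statuses.
message = monthly_message(
    ported_last_month={"mod_a", "mod_b"},
    ported_now={"mod_a", "mod_b", "mod_c", "mod_d", "mod_e"},
    site_modules={"mod_a", "mod_d", "mod_x"},
)
```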


Who is this for? The idea behind this issue is to create a way to show anyone within the Backdrop community who wants to help with porting, how they can help with porting. I'm trying to remove the barrier between guessing what is needed and knowing exactly what needs to be ported. It's literally a task list for anyone who is interested in helping port modules.


Regarding a combined vs separate approach, I really want to push for a separate approach for several reasons:

1) This is temporary data collection (2-5 years) that will eventually go away. There's no need to make it fancy. It has one job and only needs to do that one job well.

2) This can be focused on and implemented in a very short amount of time compared to what is needed for the telemetry initiative. We wouldn't have to worry about adhering to APIs or building out a solid foundation to get this accomplished.

3) It keeps the data separate between the two telemetry points. This proposal is for collecting data from Drupal whereas the telemetry initiative is for collecting data from Backdrop.

@stpaultim I know you're busy and can't get to this idea any time soon. Your feedback is very valuable though and I greatly appreciate it.


Now, if we had a volunteer that was eager to write the code and make it happen, that would be less of a consideration.

I'm willing to take the lead if anyone is willing to work with me to make it happen, since I am unable to handle the coding side.

I think what I need from the core group is a) blessing to move forward and b) some direction on where to host the collection site. I've got no problem hosting it myself, but in my mind it would be best if hosted on the b.org servers.

docwilmot commented 3 years ago

First of all my utmost support to @philsward for the dedication to the cause. I agree with everything. I'll help where I can.

I agree it would be nice to have basic telemetry. We've been talking about telemetry for ages now without doing it and I really think a few thousand rows in a database is a worthy price to pay to get some action.

A simple button on the BUS report page that asks to "send your list of modules completely anonymously to Backdrop CMS.org, no other data is collected" etc would be easy. This could make a simple http request to an endpoint somewhere in Backdrop-land that records the list in a database.
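A minimal sketch of what that button could send, in Python for illustration only (the module itself would be PHP on Drupal 7). The endpoint URL, the function name, and the payload shape are all assumptions, since none of them have been decided in this thread. The optional `site_id` reflects the random-id idea discussed earlier and is left out by default so that nothing but the module list is sent.

```python
import json
import urllib.request

# Hypothetical endpoint; where the data is collected has not been decided.
ENDPOINT = "https://example.org/bus/submit"


def build_submission(modules, site_id=None, endpoint=ENDPOINT):
    """Build (but do not send) a POST request carrying the module list."""
    payload = {"modules": sorted(set(modules))}
    if site_id is not None:
        payload["site_id"] = site_id  # optional random id, never a domain name
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_submission(["webform", "views", "views"])
# To actually send it: urllib.request.urlopen(request, timeout=10)
```

Because the request is made by the site's server and not by a browser, browser same-origin restrictions do not apply to it.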

I ported that module and would be happy to put in the button and the http request if someone could build an endpoint and a database table.

philsward commented 3 years ago

@docwilmot yay!

I'll come up with a list of things I would like to see incorporated on the BUS side and we can all hash out which items get included.

If we're all in agreement that the collection point should be separate from the Backdrop telemetry data, would you be able to create a small module on the endpoint site to collect the responses? Or is there already a module (D7 or BD) that can accept a JSON or RESTful request? It might be as simple as porting it and turning it on, provided the port is easy 😉

My thought is to create a subdomain such as bus.backdropcms.org and plop a site down on it. Add an endpoint collector that receives the data from the various sites and stores it in the DB. From there we can use views to display the data and make it human friendly.
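To make the collector idea concrete, here is a rough Python sketch using SQLite. It is only an illustration: the real collector would more likely be a Drupal or Backdrop module, and the table name, function names, and JSON shape (`site_id` plus `modules`) are assumptions. Each submission replaces that site's previous rows, and one query produces the "most used modules" list that a view would display.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS site_modules ("
        " site_id TEXT NOT NULL,"
        " module TEXT NOT NULL,"
        " PRIMARY KEY (site_id, module))"
    )
    return conn


def record_submission(conn, raw_body):
    """Parse one JSON submission and replace that site's stored module list."""
    data = json.loads(raw_body)
    site_id = str(data["site_id"])
    modules = sorted({str(m) for m in data["modules"]})
    with conn:  # one transaction: delete old rows, then insert new ones
        conn.execute("DELETE FROM site_modules WHERE site_id = ?", (site_id,))
        conn.executemany(
            "INSERT INTO site_modules (site_id, module) VALUES (?, ?)",
            [(site_id, m) for m in modules],
        )


def top_modules(conn):
    """Return (module, site_count) rows, most widely used first."""
    return conn.execute(
        "SELECT module, COUNT(*) AS sites FROM site_modules"
        " GROUP BY module ORDER BY sites DESC, module"
    ).fetchall()


conn = open_store()
record_submission(conn, '{"site_id": "a1", "modules": ["views", "webform"]}')
record_submission(conn, '{"site_id": "b2", "modules": ["views"]}')
record_submission(conn, '{"site_id": "a1", "modules": ["views", "rules"]}')
```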

One thing to think about: can we have the site automatically check GitHub to see whether a port has already been created? If so, filter that module from the list. So we may want to look at tying the server in with the GitHub API. We don't want people working on a port when one has recently been created by someone else.
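The filtering step itself is simple once the list of existing ports is available. The Python sketch below assumes that list has already been fetched somewhere else (for example from the repositories in the backdrop-contrib GitHub organization; the fetch is not shown). The function name and placeholder module names are hypothetical, and matching purely on project name is an assumption that would miss ports that were renamed or merged into core.

```python
def still_needs_port(requested_counts, ported_names):
    """Drop modules that already have a port and rank the rest.

    `requested_counts` maps a Drupal 7 module name to the number of sites
    using it. `ported_names` is any collection of project names that
    already exist for Backdrop.
    """
    ported = {name.lower() for name in ported_names}
    remaining = {
        name: sites
        for name, sites in requested_counts.items()
        if name.lower() not in ported
    }
    # Most requested first, ties broken alphabetically.
    return sorted(remaining.items(), key=lambda item: (-item[1], item[0]))


# Placeholder names only; these are not real port statuses.
requested = {"mod_a": 40, "mod_b": 35, "mod_c": 20, "mod_d": 20}
ported = ["Mod_B", "mod_c"]
todo = still_needs_port(requested, ported)
```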

I'm not sure who manages the b.org domains, guessing @jenlampton or @quicksketch so we'll have to get their approval on adding a subdomain and spinning up a site.

klonos commented 3 years ago

...When a user is configuring the module, they would be asked to share an anonymous list of the Drupal modules that they are using for the purposes of helping the Backdrop Community port the most requested modules.

It would be a very straightforward and transparent request and in the interest of users to share the data. But, fully optional and opt-in (in my opinion).

Yup, I agree with that, and fully support this idea. ...now it comes down to time/resources, and getting things done 😅

What is the status of the upgrade module? Do we have buy-in from the maintainer? Are people actually using the upgrade module?

The Backdrop version of the module can (should?) be a separate project, so that we have full control over it. On top of that, hosting it on d.org will allow us to see its usage stats (development can still happen in GitHub, over at https://github.com/backdrop-contrib/backdrop_upgrade_status). The usage number should give us an idea of people/sites looking to move to Backdrop, and gauge interest.

...this data will only be useful during a time frame of 1-2 years.

Not sure about that. D7 EoL has been pushed to 2022, but Vendor Extended Support will be running till November 2025.

...it might make sense to keep this project completely separate from the current core telemetry initiative. I'm still not sure on the technical considerations.

I believe that there will be a new core module called Telemetry, which will be sending data to a server/database (having it as a separate module will allow people to disable it completely). That core module should not be confused with the Telemetry initiative, or the Telemetry server collecting the data.

If we're all in agreement that the collection point should be separate from the Backdrop telemetry data...

Unless there are technical limitations, I think that both the core Telemetry module and the Backdrop Upgrade Status D7 contrib module could push data to the same Telemetry server. I would like us to avoid duplication of work, by avoiding the need for 2 separate endpoints that collect data.

...and they see a list of the top modules that still need ported and some of those are used by that user, it might encourage them to jump in and port them.

Or it could be a trigger to test alpha/beta versions and provide feedback.

... Backdrop is very small. 1500 vs 100,000+

Not sure where you are getting your stats from @philsward, but https://www.drupal.org/project/usage/drupal shows more like 700k D7 sites ...assuming you were referring to D7 sites out there.

If we can get Backdrop Upgrade Status installed on 10,000 Drupal 7 sites, I would feel very accomplished with this project

Ditto 👍

...we can use the BUS module to subtly be in front of D7 users by showing a monthly admin site message status of "There were X more modules added to Backdrop CMS since last month and Y of them you use. [link] Check the status of your site [/link].

Loving this idea! 👍

First of all my utmost support to @philsward for the dedication to the cause. I agree with everything. I'll help where I can.

Same here ❤️

stpaultim commented 3 years ago

(If we can get Backdrop Upgrade Status installed on 10,000 Drupal 7 sites, I would feel very accomplished with this project)

I love this optimism. I suppose my skepticism comes from my own sense that we'll be very lucky to see a few hundred people actually use this module/feature. I hope I'm way off. If I thought we could get 10,000 people using this module, I'd consider it a much higher priority on my own list of things to do.

If there are folks with the skills to implement this available and interested to work on it, I don't see any harm in trying it. I like the idea and hope it works, I'll do what I can to support it.

philsward commented 3 years ago

Heh mobile app went haywire...

Throwing this out there:

https://www.ostraining.com/blog/drupal/services/

I believe the Services module will receive data as well as send it.

I'll play around with it when I get a chance, but it might be over my head.

I think if we can find an existing D7 module that will work as an endpoint to receive a JSON or XML payload, we can either just run the endpoint site on D7 or port it to BD. Outside of the EoL for D7, there's no reason the endpoint couldn't be a Drupal site for a very low overhead, fast deployment solution.

@klonos I'll try to get your comments addressed in the next day or so.

philsward commented 3 years ago

Great feedback!

@klonos said:

...this data will only be useful during a time frame of 1-2 years.

Not sure about that. D7 EoL has been pushed to 2022, but Vendor Extended Support will be running till November 2025.

Where I was going with the 1-2 year time frame is that I believe that many of the major modules that are left, can be ported over a 1-2 year time frame. I envision the data will be collected and used over the next 4-6 years, but it will be the most useful over a 1-2 year period.

Unless there are technical limitations, I think that both the core Telemetry module, as well as the Backdrop Upgrade Status D7 contrib module could be able to push data to the same Telemetry server. I would like us to avoid duplication of work, by avoiding the need for 2 separate endpoints that collect data.

Ok, so part of me totally agrees, but the other part of me also totally disagrees.

I do like the idea of a single endpoint to send data to and, as mentioned, reduced duplication of work. However, what I fear is that it will take a long time to build the foundation of a single endpoint site because its focus needs to be well thought out and robust for the Telemetry initiative to scale for the future. I have a feeling the Telemetry endpoint site will be a fairly complicated endeavor. In addition, we will have data for two different platforms in there which muddies the waters.

In contrast, if we split them into two sites and two different databases (my vote), we can create a very basic, crude setup that doesn't need to scale for the future. Over time its usage will go down and eventually that data and site can go away. By splitting them into two sites, any disruptive changes that need to be made on the BD endpoint won't affect the D7 endpoint.

This isn't a blocker for me. I would agree to a single endpoint site if the work required to build a second site turns out to be extremely involved. In that case then yeah, focus on a single endpoint.

Backdrop is very small. 1500 vs 100,000+

Not sure where you are getting your stats from @philsward, but https://www.drupal.org/project/usage/drupal shows more like 700k D7 sites ...assuming you were referring to D7 sites out there.

When @stpaultim mentioned getting data by polling the Backdrop community, I was comparing people. I was also being modest with the 100k reference of people who use D7. I was mainly trying to point out that we have a much bigger opportunity to get in front of the D7 audience for the information we're trying to gather, than we do by getting in front of the BD community. Tim's suggestion about polling the BD community isn't right or wrong, it's just a numbers game.

philsward commented 3 years ago

Initial testing of the D7 Services module looks very promising.

It would need its own connector built for our specific use-case, but I was able to get it to create a node by sending it JSON data using a browser-extension REST client (Boomerang on Chrome).

JSON is apparently not good for cross-site requests, so we might need to use JSONP? I don't know enough about it to offer a suggestion.

I'm tempted to open another issue (maybe two) and split discussion out between BUS and the site endpoint discussion to keep the focus between the two points separate. Thoughts?