Closed rviscomi closed 2 years ago
@bobbyshaw - I got all of them except for these 3. Can you let me know where you saw them? I did a dump of all the categories and I don't see them.
- Top “reviews” technology category
- Top “Translation” technology category
- Top “Buy now / pay later” technology category
For hreflang, I am able to identify sites that use it with the query below (sample set only). What are you looking to collect from here?
-- Ecommerce pages in the sample set whose response body mentions hreflang
SELECT DISTINCT
  page
FROM
  `httparchive.sample_data.response_bodies_*` rb
WHERE
  REGEXP_CONTAINS(body, "hreflang")
  AND EXISTS (
    SELECT
      1
    FROM
      `httparchive.sample_data.technologies_*` ht
    WHERE
      rb.page = ht.url
      AND ht.category = "Ecommerce" )
Same with the CSP, I haven't yet figured out the query here. Are you trying to report the % of Ecommerce sites that use the CSP header?
There were a couple of other minor potential discussion points in the outline. Are either of these feasible?
- Use of link hreflang tags (to indicate international ecommerce)
- Use of content-security-policy header set? (report only/enforce)
I might be doing something wrong, but when I ran that query without the category filter, it took so long (5 min+). I wonder if it's because of the number of categories. Even with just ecommerce as a filter it takes ~1 min.
I think this would be enough to be a point in the article. Basically this query without the category filter would tell us whether ecommerce sites are on average over or under performing the rest of the web.
@rrajiv Hey, no problem. I was looking here:
Perhaps they're only available in a newer version of wappalyzer?
For hreflang,[...] What are you looking to collect from here?
In the first instance, the % of sites that contain a hreflang tag. A nice-to-have would be a count per number of hreflang tags, e.g. X have 0 hreflang tags, Y have 1, Z have 2, and so on.
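To make the request concrete, here is a minimal Python sketch of the two numbers asked for: the share of pages with at least one hreflang link tag, and the count of pages per number of tags. It is not the BigQuery query used for the chapter; `SAMPLE_BODIES`, the `example` URLs, and the function names are hypothetical, and it assumes hreflang is declared on `<link>` elements in the HTML rather than in HTTP headers.

```python
from collections import Counter
from html.parser import HTMLParser


class HreflangCounter(HTMLParser):
    """Counts <link> tags that carry an hreflang attribute."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        if tag == "link" and any(name == "hreflang" for name, _ in attrs):
            self.count += 1


def count_hreflang_tags(html):
    """Number of hreflang link tags in one response body."""
    parser = HreflangCounter()
    parser.feed(html)
    parser.close()
    return parser.count


def hreflang_distribution(bodies):
    """Maps 'number of hreflang tags' to 'number of pages with that many'."""
    return Counter(count_hreflang_tags(body) for body in bodies.values())


def pct_with_hreflang(bodies):
    """Percentage of pages with at least one hreflang link tag."""
    if not bodies:
        return 0.0
    with_tag = sum(1 for body in bodies.values() if count_hreflang_tags(body) > 0)
    return 100.0 * with_tag / len(bodies)


# Hypothetical sample data standing in for ecommerce response bodies.
SAMPLE_BODIES = {
    "https://shop-a.example/": "<html><head></head><body>No alternates</body></html>",
    "https://shop-b.example/": '<head><link rel="alternate" hreflang="en" href="/en/"></head>',
    "https://shop-c.example/": (
        '<head><link rel="alternate" hreflang="en" href="/en/">'
        '<link rel="alternate" hreflang="de" href="/de/" /></head>'
    ),
    "https://shop-d.example/": '<head><link rel="alternate" hreflang="fr" href="/fr/"></head>',
}

if __name__ == "__main__":
    print(dict(hreflang_distribution(SAMPLE_BODIES)))
    print(pct_with_hreflang(SAMPLE_BODIES))
```

One design note: counting parsed `<link>` tags is stricter than matching the string "hreflang" anywhere in the body, which would also count pages that only mention the word in text or scripts.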
I found a couple of older queries related to hreflang, if that helps at all:
Are you trying to report the % of Ecommerce sites that use the CSP header?
Yes, a statistic on the % that have the "Content-Security-Policy" or "Content-Security-Policy-Report-Only" header would be great.
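As a rough illustration of that statistic, the Python sketch below classifies each site by which of the two headers it sends and reports the share per mode. It only shows the logic, not the actual HTTP Archive query; `SAMPLE_SITES`, the `example` URLs, and the function names are hypothetical, and it assumes one header dictionary per site with header names compared case-insensitively.

```python
from collections import Counter

ENFORCE = "content-security-policy"
REPORT_ONLY = "content-security-policy-report-only"


def csp_mode(headers):
    """Classifies one site's response headers as enforce, report-only, both, or none."""
    # HTTP header names are case-insensitive, so normalise before comparing.
    names = {name.strip().lower() for name in headers}
    has_enforce = ENFORCE in names
    has_report_only = REPORT_ONLY in names
    if has_enforce and has_report_only:
        return "both"
    if has_enforce:
        return "enforce"
    if has_report_only:
        return "report-only"
    return "none"


def csp_usage_pct(sites):
    """Percentage of sites per CSP mode; modes with no sites are omitted."""
    if not sites:
        return {}
    counts = Counter(csp_mode(headers) for headers in sites.values())
    total = len(sites)
    return {mode: 100.0 * n / total for mode, n in counts.items()}


# Hypothetical sample data standing in for ecommerce response headers.
SAMPLE_SITES = {
    "https://shop-a.example/": {"Content-Security-Policy": "default-src 'self'"},
    "https://shop-b.example/": {"content-security-policy-report-only": "default-src 'self'"},
    "https://shop-c.example/": {"Content-Type": "text/html"},
    "https://shop-d.example/": {},
}

if __name__ == "__main__":
    print(csp_usage_pct(SAMPLE_SITES))
```

Keeping "both" as its own bucket avoids double counting sites that enforce one policy while trialling another in report-only mode.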
I might be doing something wrong but when I ran that query without the category filter, it took so long (5 min+)
Ok, don't worry about that query, that's not worth it :)
@bobbyshaw - I see Reviews and Translations appear in `httparchive.technologies.2021_08_01_*`. I don't see Buy now pay later, perhaps because that category hasn't been seen in the wild yet? `httparchive.technologies.2021_07_01_*` does not have these 3 new categories.
I'll work on the other 2 queries sometime this week.
@rrajiv / @bobbyshaw - I added the 'Reviews' / 'Translation' / 'Buy now pay later' / 'Loyalty & Rewards' categories very recently in Wappalyzer. Even if you see some data, it will be very limited. I think we should look at these next year.
@rrajiv - The topic of hreflang was covered last year in the SEO chapter - https://almanac.httparchive.org/en/2020/seo#hreflang. You should be able to find the relevant queries from last year and filter just for Ecommerce.
I believe these are the queries you need -
https://github.com/HTTPArchive/almanac.httparchive.org/blob/main/sql/2020/seo/pages_wpt_bodies_hreflang_by_device_and_http_header_value.sql
https://github.com/HTTPArchive/almanac.httparchive.org/blob/main/sql/2020/seo/pages_wpt_bodies_hreflang_by_device_and_link_tag_value.sql
@rockeynebhwani and @bobbyshaw - thank you for the pointers.
I have added the following to the Excel sheet:
- hreflang through link attribute for Ecommerce
- hreflang through headers for Ecommerce
- Content-Security-Policy and Content-Security-Policy-Report-Only on Ecommerce sites
Great, thanks @rrajiv. I'll review the results spreadsheet and get started!
Hey, a quick update. I’m a bit behind on digesting the analysis and writing the draft but I started in earnest yesterday. I hope to have something to review in a week’s time.
I’ve got a couple of questions so far. Would you be able to help @rrajiv? I appreciate you said you would be travelling so I don’t expect a quick response.
There’s a sharp rise in ecommerce platforms (215 vs 145 last year). I'd expect some rise as more platforms are added to Wappalyzer but there are a number of technologies in there that I wouldn’t consider to be ecommerce. I’ve checked Wappalyzer signatures but they didn’t seem to be in the ecommerce category nor imply cart functionality. Do you know why that might be?
Examples of anomalies in the top vendors tab are:
One other platform I’m not sure about is 1C-Bitrix. It’s a Russian software suite that has an ecommerce product within it but not as a core component to it. We included it last year so I’d be interested in your thoughts @rockeynebhwani. Is it fair to include it in the top 10 list when it’s likely that actually, a much smaller proportion of all 1C-Bitrix sites are ecommerce? I guess we can't discount or adjust its position as any adjustment would be based on an assumption. I think you had a similar problem in the past with Wix though that seems to have specific ecommerce signatures now.
I’ve also started to read through the figure guide on how to create charts but I may need some help. In the first instance, I’ve added
@bobbyshaw - I wouldn't worry about the sharp increase in the number of ecommerce platforms. I personally have contributed around 30 different platforms to Wappalyzer since last year. As of today, Wappalyzer is tracking 264 different ecommerce platforms. You can see the latest count on this page - https://www.wappalyzer.com/technologies/ecommerce.
I have personally observed that the technologies analysis for CMS/ecommerce is skewed towards North America. In the last 12 months, I added many different platforms from Korea, Latin America, India and other countries in Europe. That may be one of the reasons.
Regarding technologies like Loox, Omnisend etc., it's a problem due to the open source nature of Wappalyzer. Anybody can add a technology and assign it to the ecommerce category when they can't find another appropriate category. For example, Loox is an app for reviews, but there was no category for reviews until very recently, so contributors chose ecommerce by default. However, in many cases this resolves itself as new categories are introduced over time (for example, Loox is now categorised under 'Reviews'). I checked all the examples in your comment and none of them is categorised under ecommerce any more. You can search for these on the link I shared above. You are looking at the latest Wappalyzer signatures on GitHub, whereas the query output is from July 2021 and these were updated after July 2021. For the purpose of the top 10 platforms, I suggest you ignore these.
@bobbyshaw - Regarding 1C-Bitrix, I am not very familiar with this platform and I didn't realise this last year. Yes, it's the same issue as 'Wix'. This year, I was able to get in touch with the 'Wix' team and make changes to Wappalyzer to split Wix detection into 'Wix' (CMS) and 'Wix commerce'. We should do the same with 1C-Bitrix if there is a way to identify it. For now, I suggest you add this as a caveat, as I did for 'Wix' last year.
@bobbyshaw - This is the most recent discussion I could find on 1C-Bitrix. As of now, we don't know how to differentiate between CMS and commerce sites. - https://github.com/AliasIO/wappalyzer/pull/4157
Thanks @rockeynebhwani. That's really helpful.
@bobbyshaw - I am still on the road but if you want to let me know the questions, I can answer when possible.
If you need charts let me know the tabs and I can try it from the iPad.
@bobbyshaw - I will also be on the move for the next 4 weeks but I can try to help with the charts. @rviscomi - I don't have edit access on the results sheet. Can you please grant me 'edit access'?
@rockeynebhwani can you hit "Request edit access"?
@rviscomi - I already did a couple of days ago but have done it again now. Let me know if you don't receive my request.
@bobbyshaw - I have created all charts in results sheet. Please have a look and let me know if I missed anything or if anything is not clear.
That's incredible, thanks @rockeynebhwani 🤩
@bobbyshaw do you think I can start reading for the review?
Thanks for your patience team. I can now offer my very rough first draft for review. Given that days are passing quickly feel free to review at your earliest convenience and I will respond to each as and when I can.
@rockeynebhwani @fili @samdutton @alankent @soulcorrosion (@shantsis I'm not sure the appropriate time for an editor to get involved but tagging you as a heads-up anyway).
Overall, I think we’ve found the ecommerce landscape to be very similar to last year. However, we do have a couple of new discussion opportunities, particularly with the ranking data. There was some rapid growth around Q2-3 last year when COVID hit but the growth rate appears to have returned to pre-pandemic levels.
In terms of what we’ve covered, we came up with so many topics during the outline, which is great. It’s fair to say that we didn’t get through them all! There were some that we just didn’t get around to doing in the depth that was suggested, e.g. SEO, and others that weren’t practical because of lack of data, e.g. very few personalisation technologies.
In terms of limitations, I think going forward headless sites are going to cause us the most trouble. Even in this year’s edition, it would have been nice to have more to say on this trend. While I’m sure a lot fewer people are going headless than the buzz would suggest, the easiest and sometimes only way for us to detect a platform is through its frontend markup choices.
Over the next week, my plan is to:
- Compare with last year's chapter for any further commentary that could be made.
- Respond to all of your feedback and corrections and update as appropriate.
- Re-read the author guide and style guide and re-draft with that in mind.
- Read through the next steps for converting to markdown and get started.
For any other questions or longer discussions not suited to here or the [Google Doc](https://docs.google.com/document/d/1LQjpsaWx-5ZtHQGRnHlPnekkxuap50KzJZJTIaSX4B4/edit#heading=h.l58oy8wsputh), you're welcome to find me in the #web-almanac-ecommerce Slack channel.
Thanks for the update. I will have a look at it in the coming week and get back to you.
On Fri, Nov 5, 2021, 18:17 Tom Robertshaw wrote: [quoted copy of the comment above]
Just wanted to say I think the start is coming together well @bobbyshaw (and others)! The end still needs work (not finished). I finished a complete pass through. Feel free to mention me on this thread again later if you want me to make another pass.
I did a first pass through the doc. Main thing to be careful of is use of past tense (for our analytics) vs present (current state of web), and use of British vs US spelling https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Editors'-Guide
The other thing to note is that there are a lot of charts using only green bars (not desktop related) that are leading to poor contrast. @tunetheweb suggests we can either use the dark gray color instead for both the bar and the label, or just for the label.
The other option is to use black labels. But I find they look better as inside labels (something just seems "off" when they are outside labels for green bars):
Thanks, everyone. I've incorporated all feedback, including the chart suggestions and Americani~s~zation 🙂
I'm going to take a break and come back in a few days to start the process of converting to markdown. I'll do my best to incorporate any final comments made during that period.
Thanks, again.
@bobbyshaw @rockeynebhwani @fili @samdutton @alankent @soulcorrosion @rrajiv @shantsis
🎉 This chapter is fully written, reviewed, edited, and ready to be launched on Wednesday! Thank you to all of the contributors who put in the time and effort to make this a great chapter.
When you get 5 minutes, I'd really appreciate if you could fill out our contributor survey to tell us (the project leads) about your experience. It's super helpful to hear what went well or what could be improved for next time. 🙏
Congratulations and thank you all again. I'm excited for this to launch soon!
Part III Chapter 17: Ecommerce
If you're interested in contributing to the Ecommerce chapter of the 2021 Web Almanac, please reply to this issue and indicate which role or roles best fit your interest and availability: author, reviewer, analyst, and/or editor.
Content team
Expand for more information about each role
- The **[content team lead](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Content-Team-Leads'-Guide)** is the chapter owner and responsible for setting the scope of the chapter and managing contributors' day-to-day progress.
- **[Authors](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Authors'-Guide)** are subject matter experts and lead the content direction for each chapter. Chapters typically have one or two authors. Authors are responsible for planning the outline of the chapter, analyzing stats and trends, and writing the annual report.
- **[Reviewers](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Reviewers'-Guide)** are also subject matter experts and assist authors with technical reviews during the planning, analyzing, and writing phases.
- **[Analysts](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Analysts'-Guide)** are responsible for researching the stats and trends used throughout the Almanac. Analysts work closely with authors and reviewers during the planning phase to give direction on the types of stats that are possible from the dataset, and during the analyzing/writing phases to ensure that the stats are used correctly.
- **[Editors](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Editors'-Guide)** are technical writers who have a penchant for both technical and non-technical content correctness. Editors have a mastery of the English language and work closely with authors to help wordsmith content and ensure that everything fits together as a cohesive unit.
- The **[section coordinator](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Section-Leads'-Guide)** is the overall owner for all chapters within a section like "User Experience" or "Page Content" and helps to keep each chapter on schedule.
_Note: The time commitment for each role varies by the chapter's scope and complexity as well as the number of contributors._
For an overview of how the roles work together at each phase of the project, see the [Chapter Lifecycle](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Chapter-Lifecycle) doc.
Milestone checklist
0. Form the content team
1. Plan content
2. Gather data
3. Validate results
4. Draft content
5. Publication
Chapter resources
Refer to these 2021 Ecommerce resources throughout the content creation process:
- 📄 Google Docs for outlining and drafting content
- 🔍 SQL files for committing the queries used during analysis
- 📊 Google Sheets for saving the results of queries
- 📝 Markdown file for publishing content and managing public metadata