Closed nrllh closed 3 weeks ago
Happy to join as I did in the past.
I am fine with author role.
Andrea Volpini
WordLift https://wordlift.io
On Thu, 4 Apr 2024 at 21:25, Nurullah Demir @.***> wrote:
Structured Data 2024
If you're interested in contributing to the Structured Data chapter of the 2024 Web Almanac, please reply to this issue and indicate which role or roles best fit your interest and availability: author https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Authors'-Guide, reviewer https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Reviewers'-Guide, analyst https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Analysts'-Guide, and/or editor https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Editors'-Guide. You might be interested in exploring the changes to this year's version here https://github.com/HTTPArchive/almanac.httparchive.org/discussions/3619.
Content team
Lead Authors Reviewers Analysts Editors Coordinator
@cyberandy https://github.com/cyberandy - - - -
More information about each role:
- The content team lead https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Content-Team-Leads'-Guide is the chapter owner and responsible for setting the scope of the chapter and managing contributors' day-to-day progress.
- Authors https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Authors'-Guide are subject matter experts and lead the content direction for each chapter. Chapters typically have one or two authors. Authors are responsible for planning the outline of the chapter, analyzing stats and trends, and writing the annual report.
- Reviewers https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Reviewers'-Guide are also subject matter experts and assist authors with technical reviews during the planning, analyzing, and writing phases.
- Analysts https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Analysts'-Guide are responsible for researching the stats and trends used throughout the Almanac. Analysts work closely with authors and reviewers during the planning phase to give direction on the types of stats that are possible from the dataset, and during the analyzing/writing phases to ensure that the stats are used correctly.
- Editors https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Editors'-Guide are technical writers who have a penchant for both technical and non-technical content correctness. Editors have a mastery of the English language and work closely with authors to help wordsmith content and ensure that everything fits together as a cohesive unit.
- The section coordinator https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Section-Leads'-Guide is the overall owner for all chapters within a section like "User Experience" or "Page Content" and helps to keep each chapter on schedule.
Note: The time commitment for each role varies by the chapter's scope and complexity as well as the number of contributors.
For an overview of how the roles work together at each phase of the project, see the Chapter Lifecycle https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Chapter-Lifecycle doc.
Milestone checklist
0. Form the content team https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Chapter-Lifecycle#0-create-content-team
- April 15: Complete program and content committee - Organizing committee
- The content team has at least one author, reviewer, and analyst.
1. Plan content https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Chapter-Lifecycle#1-plan-content
- May 1: First meeting to outline the chapter contents - Content team
- The content team has completed the chapter outline.
2. Gather data https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Chapter-Lifecycle#2-gather-data
- June 1: Custom metrics completed - Analysts
- Analysts have added all necessary custom metrics https://github.com/HTTPArchive/custom-metrics/blob/main/README.md and drafted a PR (example https://github.com/HTTPArchive/almanac.httparchive.org/pull/1087) to track query progress.
- June 1: HTTP Archive Crawl - HA Team
- HTTP Archive runs the June crawl.
3. Validate results https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Chapter-Lifecycle#3-validate-results
- August 15: Query Metrics & Save Results - Analysts
- Analysts have queried all metrics and saved the output.
4. Draft content https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Chapter-Lifecycle#4-draft-content
- September 15: First Draft of Chapter - Authors
- Authors have written the chapter.
- October 10: Review & Edit Chapter - Reviewers & Editors
- Reviewers and Editors have processed the chapter.
5. Publication https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Chapter-Lifecycle#5-publication
- October 15: Chapter Publication (Markdown & PR) - Authors
- Authors have converted the chapter to markdown and drafted a PR.
- November 1: Launch of 2024 Web Almanac - Organizing committee
6. Virtual conference
- November 20: Virtual Conference - Content Team
Chapter resources
Refer to these 2024 Structured Data resources throughout the content creation process:
- Google Docs https://docs.google.com/document/d/1DIe6aPWqzYIZsZ-ZdehLkPVuezAHdrMffY_U4BuX0s0/edit for outlining and drafting content
- SQL files https://github.com/HTTPArchive/almanac.httparchive.org/tree/main/sql/2024/structured-data/README.md for committing the queries used during analysis
- Google Sheets https://docs.google.com/spreadsheets/d/1GWniSGupK6KgME7urV7ff0iWStzopGXqnQvJ3_-ynD4/edit#gid=1778117656 for saving the results of queries
- Markdown file https://github.com/HTTPArchive/almanac.httparchive.org/tree/main/src/content/en/2024/structured-data.md for publishing content and managing public metadata
- Colab notebook https://colab.research.google.com/drive/1cYFlIVQAHbIAUua7pUcXlfQfYMFRwRVH for collaborative coding in Python, if needed
- #web-almanac-structured-data https://join.slack.com/t/httparchive/shared_invite/zt-45sgwmnb-eDEatOhqssqNAKxxOSLAaA on Slack for team coordination
Thank you, @cyberandy!
Hey @JohnBarrettWDW @SeoRobt @jasonbellwebdataworks @jonoalderson @JasmineDW - awesome contributors from previous years! Are you interested in joining us again this year?
Hey Nurullah,
Thanks for reaching out. Unfortunately, I won't be able to participate this year due to some personal timing constraints. I plan on returning in 2025, cheers until then!
Unfortunately I'm likely to be tied up with other commitments! :(
@jvandriel would be a great contributor!! Sorry @jonoalderson and @SeoRobt not to have you in the crew this year.
Sounds good to me @cyberandy.
Great! Would you like to contribute as an analyst, @jvandriel?
I fear the analyst role is beyond my skillset @nrllh (I never learned SQL). Reviewer probably suits me best.
All right, thank you!
I am happy to contribute as an editor. I am a professional technical writer (4+ years) and community contributor to various open standards initiatives (microformats2, W3C Social Web Community Group).
I am interested in being involved as a reviewer as well.
Dear @jvandriel @rrlevering and @capjamesg, would you be up for a quick call to review the outline of this year's edition of the SD Chapter? I created a Doodle for this if you like the idea: https://doodle.com/meeting/participate/id/b8OMl9la
@cyberandy Thank you for the link! I am unavailable next week, but I can meet any week after that. If there is a document with the outline that I can review, please send it over and I can provide async feedback.
Thanks @capjamesg, I have added some additional slots for the week after (same link: https://doodle.com/meeting/participate/id/b8OMl9la), and of course don't worry if you can't make it. I will share the link to the outline here once it is ready. On another note, @nrllh, when will the data be available?
It's cool to see the progress! The data will be available by this Friday.
@cyberandy the results are already in the sheet. Please check. The JSON-LD relationships and time series comparisons still need to be completed; the other results are already there.
thanks @nrllh, when do you think the time series comparisons can be completed?
Today ;) I'll ping you in the next few hours
The data is there, except for the last two figures. I'm still working on them.
Hello! I am interested in helping, e.g. as a reviewer...
I sent the memo of yesterday's meeting and also added @danbri to the loop. Here is a new Doodle for the next checkpoint: https://doodle.com/meeting/organize/id/eXnnLJAb/preview
@nrllh a few questions that came up yesterday:
Many thanks in advance.
just a confirmation that the data focuses, also this year, on the top-level page (home page) only
We analyze both home pages and inner pages, but when reporting, we do so at the site level. This means we do not count a site more than once.
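To make the site-level counting concrete, here is a minimal Python sketch. It is only an illustration, not the actual Almanac query: the row layout, the example sites, and the function names `sites_per_format` and `adoption` are all assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical crawl rows: (site, page kind, structured data formats found on that page).
rows = [
    ("example.com", "home", {"JSON-LD"}),
    ("example.com", "inner", {"JSON-LD", "Microdata"}),
    ("example.org", "home", set()),
    ("example.org", "inner", {"RDFa"}),
]


def sites_per_format(rows):
    """Map each format to the set of sites that use it on any crawled page."""
    sites = defaultdict(set)
    for site, _page_kind, formats in rows:
        for fmt in formats:
            sites[fmt].add(site)  # a set keeps each site at most once
    return sites


def adoption(rows):
    """Share of sites using each format, counting every site once."""
    total_sites = len({site for site, _, _ in rows})
    return {fmt: len(s) / total_sites for fmt, s in sites_per_format(rows).items()}


result = adoption(rows)
```

In this toy data, example.com uses JSON-LD on both its home page and an inner page, yet it contributes only one site to the JSON-LD count.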
will it be possible to extract structured data from Initial HTML (before JS execution) and compare it with structured data extracted from the Final DOM (after JS execution)? This way we could see what the common practices are on this front.
We have access to the response of all requests, including the root page's response. Could you please provide more details on what you exactly want to compare?
Thanks @nrllh for the clarifications. If my understanding is correct, structured data from inner pages is aggregated with the top-level page.
Regarding the second point, the key aspect we'd like to explore is the percentage of websites that rely on client-side JavaScript for structured data injection versus those that serve it directly from the server. Additionally, it would be insightful to analyze the correlation between the types of entities represented in JSON-LD and whether they are injected via JavaScript or delivered server-side. Many thanks in advance!
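One possible way to frame this comparison is sketched below in Python. It assumes the initial HTML response and the serialized final DOM are both available as strings; the names `JsonLdExtractor`, `jsonld_types`, and `classify` are hypothetical and not part of the HTTP Archive pipeline. For brevity the sketch reads only top-level `@type` values and ignores nested nodes such as `@graph`.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def jsonld_types(html):
    """Return the set of top-level @type values declared in JSON-LD blocks."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    types = set()
    for block in parser.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed blocks
        for item in data if isinstance(data, list) else [data]:
            if not isinstance(item, dict):
                continue
            declared = item.get("@type", [])
            if isinstance(declared, str):
                declared = [declared]
            if isinstance(declared, list):
                types.update(t for t in declared if isinstance(t, str))
    return types


def classify(initial_html, rendered_html):
    """Split entity types into server-delivered and JavaScript-injected."""
    initial = jsonld_types(initial_html)
    rendered = jsonld_types(rendered_html)
    return {
        "server_side": sorted(initial),
        "js_injected": sorted(rendered - initial),
    }


# Made-up example: the server sends Organization, a script later adds Product.
initial_html = (
    '<html><head><script type="application/ld+json">'
    '{"@type": "Organization"}</script></head></html>'
)
rendered_html = initial_html.replace(
    "</head>",
    '<script type="application/ld+json">{"@type": "Product"}</script></head>',
)
result = classify(initial_html, rendered_html)
```

Aggregating the `js_injected` lists per site would give the share of sites relying on client-side injection, and grouping by type would show which entities tend to be injected.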
Dear @danbri @rrlevering @jvandriel @capjamesg, please update the Google Doc in the next few days; I'll then proceed with opening the PR for the markdown.
I hope everyone can have a chance to review / contribute to the final document.
I have left comments in the linked PR.