WikiTeam / wikiteam

Tools for downloading and preserving wikis. We archive wikis, from Wikipedia to the tiniest wikis. As of 2023, WikiTeam has preserved more than 350,000 wikis.
https://github.com/WikiTeam
GNU General Public License v3.0
705 stars 147 forks

Fandom - Add support for comments #456

Open upintheairsheep opened 1 year ago

upintheairsheep commented 1 year ago

Fandom's API has a controller that returns the comments on a page; the results are paginated. The comments cannot be accessed the traditional way through "AllPages", as I had thought. They seem to be retained even after moderators turn off comments on a wiki. Examples:

https://phobia.fandom.com/wikia.php?controller=ArticleCommentsController&method=getComments&title=Diagraphephobia&namespace=0&hideDeleted=false&page=8
https://phobia.fandom.com/wikia.php?controller=ArticleCommentsController&method=getComments&title=Diagraphephobia&namespace=0&hideDeleted=false&page=7
https://phobia.fandom.com/wikia.php?controller=ArticleCommentsController&method=getComments&title=Diagraphephobia&namespace=0&hideDeleted=false&page=6
https://phobia.fandom.com/wikia.php?controller=ArticleCommentsController&method=getComments&title=Diagraphephobia&namespace=0&hideDeleted=false&page=5
https://phobia.fandom.com/wikia.php?controller=ArticleCommentsController&method=getComments&title=Diagraphephobia&namespace=0&hideDeleted=false&page=4
https://phobia.fandom.com/wikia.php?controller=ArticleCommentsController&method=getComments&title=Diagraphephobia&namespace=0&hideDeleted=false&page=3
https://phobia.fandom.com/wikia.php?controller=ArticleCommentsController&method=getComments&title=Diagraphephobia&namespace=0&hideDeleted=false&page=2
https://phobia.fandom.com/wikia.php?controller=ArticleCommentsController&method=getComments&title=Diagraphephobia&namespace=0&hideDeleted=false&page=1
https://phobia.fandom.com/wikia.php?controller=ArticleCommentsController&method=getComments&title=Diagraphephobia&namespace=0&hideDeleted=false
https://phobia.fandom.com/wikia.php?controller=ArticleCommentsController&method=getComments&title=Diagraphephobia&namespace=0&hideDeleted=true?page=0
https://phobia.fandom.com/wikia.php?controller=ArticleCommentsController&method=getComments&title=Diagraphephobia&namespace=0&hideDeleted=true?page=1
https://phobia.fandom.com/wikia.php?controller=ArticleCommentsController&method=getComments&title=Diagraphephobia&namespace=0&hideDeleted=true?page=2
https://phobia.fandom.com/wikia.php?controller=ArticleCommentsController&method=getComments&title=Diagraphephobia&namespace=0&hideDeleted=true?page=3
https://phobia.fandom.com/wikia.php?controller=ArticleCommentsController&method=getComments&title=Diagraphephobia&namespace=0&hideDeleted=true?page=4
https://phobia.fandom.com/wikia.php?controller=ArticleCommentsController&method=getComments&title=Diagraphephobia&namespace=0&hideDeleted=true?page=5
https://phobia.fandom.com/wikia.php?controller=ArticleCommentsController&method=getComments&title=Diagraphephobia&namespace=0&hideDeleted=true?page=6
https://phobia.fandom.com/wikia.php?controller=ArticleCommentsController&method=getComments&title=Diagraphephobia&namespace=0&hideDeleted=true?page=7
https://phobia.fandom.com/wikia.php?controller=ArticleCommentsController&method=getComments&title=Diagraphephobia&namespace=0&hideDeleted=true?page=8

This may be more suitable for a WARC with ArchiveBot than a WikiDump.
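
For anyone who wants to experiment, here is a rough sketch of how the paged requests could be enumerated. It only builds the URLs from the query parameters seen in the examples above and walks the pages through a caller-supplied `fetch` function. The helper names (`comments_url`, `iter_comment_pages`), the starting page number, the `max_pages` cap, and the rule "stop when `fetch` returns something empty" are all assumptions for illustration; the response format of the controller is not documented here, so deciding what counts as an empty page is left to `fetch`.

```python
from urllib.parse import urlencode


def comments_url(wiki_base, title, page=None, namespace=0, hide_deleted=False):
    """Build a getComments URL using the parameters from the example links."""
    params = {
        "controller": "ArticleCommentsController",
        "method": "getComments",
        "title": title,
        "namespace": namespace,
        "hideDeleted": "true" if hide_deleted else "false",
    }
    if page is not None:
        params["page"] = page
    return f"{wiki_base}/wikia.php?{urlencode(params)}"


def iter_comment_pages(fetch, wiki_base, title, first_page=1, max_pages=1000, **kwargs):
    """Yield (page, body) pairs until fetch() returns a falsy value.

    Assumptions (not confirmed by the thread): pages start at first_page,
    and fetch(url) returns None or an empty value once the comments run out.
    max_pages is a hypothetical safety cap against endless loops.
    """
    for page in range(first_page, first_page + max_pages):
        body = fetch(comments_url(wiki_base, title, page=page, **kwargs))
        if not body:
            break
        yield page, body
```

In practice `fetch` could be a thin wrapper around an HTTP client that returns the response body, or `None` when the body contains no comments. Keeping the network call outside the pager makes it easy to add rate limiting or to swap in a WARC-writing client.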

nemobis commented 1 year ago

Once upon a time, these were just normal pages included in the XML dump, though it took millennia to download them one request at a time. Have you checked?

nemobis commented 1 year ago

Some context https://community.fandom.com/f/p/3650285180289026549

The above answers are based on how the old migration script that we used in the past worked. For the upcoming forum retirement, we'll have to update that script.