jikan-me / jikan

Unofficial MyAnimeList PHP+REST API which provides functions other than the official API
https://jikan.moe
MIT License
851 stars 94 forks source link

Change anime synopsis xpath #496

Closed ToshY closed 1 year ago

ToshY commented 1 year ago

Fixes #495

ToshY commented 1 year ago

Unrelated tests failing

Edit

If possible tag a new release 🙏

pushrbx commented 1 year ago

We need to test whether the JSON serializer in Jikan API drops these newlines or replaces them with \n.

ToshY commented 1 year ago

Hey @pushrbx

Please add the same change for the other media types too.

Can you clarify which other media types? The MangaParser and AnimeEpisodeParser are also using the same XPath with meta tag, so I can update those too if that's what you're referring to.

pushrbx commented 1 year ago

Hey @pushrbx

Please add the same change for the other media types too.

Can you clarify which other media types? The MangaParser and AnimeEpisodeParser are also using the same XPath with meta tag, so I can update those too if that's what you're referring to.

I was referring to any other parser where there is any parsing of "description" or "synopsis" where the \n breakpoints would matter. That includes MangaPraser and AnimeEpisodeParser too.

ToshY commented 1 year ago

@pushrbx

After looking over at leftover parsers again, the only parsers that use meta tags for descriptions/synopsis/about are those 2 mentioned before.

The more I tested those parsers, the less I wanted to change them due the following findings:

  1. For manga:
    • With synopsis (example): XPath //span[@itemprop='description'] exists (note span here instead of p).
    • Without synopsis (example): XPath //span[@itemprop='description'] does not exist.
  2. For anime episode:
    • With synopsis (example): No simple XPath like anime of manga can be used directly. Synposis starts somewhere around //h2[contains(@class, 'fw-b')].
    • Without synopsis for existing episode (example): Contains both //h2[contains(@class, 'fw-b')] and //div[contains(@class, 'badresult al')].
    • Without synopsis for non-existing episode (example): Contains only //div[contains(@class, 'badresult al')].

Because each media type behaves differently (regarding synopsis), I think it would be more manageable to cut it into separate PRs, each one fixing it for a specific media type and updating its corresponding test.

In order to not broaden the scope of the initial issue, I want to keep the PR as is.

pushrbx commented 1 year ago

Understood, then it's fine like this. Just one more thing:

ToshY commented 1 year ago

@irfan-dahir is it possible to tag a new release for this?