Closed raiym closed 6 years ago
Is this issue due to a modification at instagram side? Will it affect media fields or other access methods?
is it fixed now?
Not yet
ok, looking forward to hearing good news from u
Hi team! Instagram data format changes fixed in pull request https://github.com/postaddictme/instagram-java-scraper/issues/51 but I think that media model should be changed to reflect fields and structure of graphql .
@igor-suhorukov could you please explaing more? What do you mean under: "reflect fields and structure of graphql"? Why do we need it?
@Mehrshadow it fixed by @igor-suhorukov
tnx 👍
What do you mean under: "reflect fields and structure of graphql"? Why do we need it?
@raiym current data model do not reflect all graphql fields. As example edge_media_preview_like collection in https://www.instagram.com/p/{post}/?__a=1 (preview list of like) also in graphql only single video quality/resolution are present and so on.
Also current data model not useful in terms of data analysis with java 8 stream API (fields getter are absent, so method references not available). Current gson to domain object mapping are boilerplate, error prone and hard to support - as reported by SonarQube metrics and based on my experience.
User to persist data model must copy object to another, but very similar data model. As an example https://github.com/igor-suhorukov/instagram-statistics/tree/master/src/main/java/com.github.suhorukov.instagram.statistics/model
It not so comfortable and looks like boilerplate code in case of similar existing instagram-java-scraper POJO.
I can help project to rewrite model/data mapping layer in more supportable code style and convenience to use case.
@igor-suhorukov thank you for your detailed response.
I support your opinion and I think that model data structure must be as close as possible to official instagram data structure (we plan to support official instagram api within this library as well). And at the same time note that in different queries the same object may be in different json formats and with different field names (look at this parse method https://github.com/postaddictme/instagram-php-scraper/blob/980b1ce6b0c263fe754de03e325fbf17ac0057e7/src/InstagramScraper/Model/Media.php#L359)
Why is gson boilerplate? Do you have other options to consider?
About getters and setters yes, better to add.
If you have time to start refactoring I am happy with it.
I am planning to contribute on weekends
@raiym thank you for response.
Why is gson boilerplate? Do you have other options to consider? Because of too many explicit unsafe type conversion in instagram-java-scraper code for each properties.
There are different options to map json to POJO objects in java without boilerplate mapping code:
About getters and setters yes, better to add.
Lombok help us to get rid boilerplate code for Getters/Setters/ToString/Equals/HashCode etc
I will provide java object model to you for review and feedback
Regards, Igor
And at the same time note that in different queries the same object may be in different json formats and with different field names
@raiym we just need to support different mapping for the same object model
Hello @raiym
Please review new data model. Based on graphql as close as possible: data model
To match json data format: json dumped by current test pack me.postaddict.instagram.scraper.AnonymousInstaTest
Dear @igor-suhorukov thank you for your hard work.
I will review it and respond in two days (or sooner).
more details for you @raiym
graphql Instagram data model rendered as relational DB structure:
@igor-suhorukov thank you for valuable contribution. I have reviewed new data model and I am OK with the most part.
Here the things that I want to point out before the merge:
me.postaddict.instagram.scraper.domain
package untouched until we implement all the methods that current version has. edge_media_to_tagged_user
)private List<Account> likes;
private List<Comment> comments;
Sincecomments
and likes
contains only first ~10-20 nodes developers. I think we need to rename it like: lastComments
and lastLikes
or firstComments
and firstLikes
.
We have special methods to get comments (and likes) (https://github.com/igor-suhorukov/instagram-java-scraper/blob/79527dea0c857834ed37cb1ab4c1f39da7a0889d/src/main/java/me/postaddict/instagram/scraper/Instagram.java#L267)So in short I am ready to merge your pull request after 1. and 5. will be changed. 2. 3. and 4. we can change after PR merged. We can discuss any point in details if you disagree or have something to say.
Hi @raiym thank you for review and feedback!
I will map json to new data model and integrate it into existing solution.
My comments regarding your review:
Hi @raiym and @igor-suhorukov! Try Gson from Google. this library has the plus in that it does not require 100% json to the pojo class. Some fields may be missing, and extra ones may be present, this does not prevent you from calling objects. and this will greatly help, since in different queries the same object may be in different json formats and with different names. plus if an extra field suddenly appears, this will not lead to errors.
Hi @raiym We in two steps from pull request - code refactoring and tests polishing still needed. Could you please look at https://github.com/igor-suhorukov/instagram-java-scraper and give your feedback.
Json to domain mapping are pluggable and all mapping described as moxy *,json rules src/main/resources/me/postaddict/instagram/scraper/model and ModelMapper class.
@kovalchuk11 thank you for information. Gson are already used as mapper in instagram-java-scraper, I hope in future json to model mapping will be pluggable...
Fixed issue with getMedias() but need to fix parsing methods. https://github.com/postaddictme/instagram-java-scraper/commit/283b1b7d29d0f27f4db4f87d6dbc506523e7d8e6
Will fix it in 24 hours