InstaPy / InstaPy

📷 Instagram Bot - Tool for automated Instagram interactions
GNU General Public License v3.0

Proposal: Advanced logging for performance analysis #1480

Open dkbast opened 6 years ago

dkbast commented 6 years ago

tl;dr what this is good for:

User story: Say you have 30 hashtags from the same niche and run "like by tag" on them. I aim for 300 interactions, so I set "amount" to 10. To get the most out of those 300 interactions, I would like to single out tags that are often used together (these show up as "Image already liked"), and I also want to track my conversion rate (new followers / number of posts interacted with) for each tag. This would let me set a different interaction amount per hashtag (some get thousands of new posts every day, some only one). I also want to run my script multiple times a day, so the first run should be finished by the time the second run is scheduled (otherwise only the first tags in my list get interacted with).

Implementation: For each interaction I need to log the prospect's username (or ID), the source (#hashtag or @user; interactions from one's own feed should be logged under one's own name), and the action (following / commenting / liking). To be able to split-test comments, I suggest logging the comment text as well. I propose a table like this:

| Time | ProspectID | PostID | Source | Follow | Like | Comment | CommentText | AlreadyLiked |
|---|---|---|---|---|---|---|---|---|
| YYYY-MM-DD HH:MI:SS | samkolder | AfbZ6mOBKrz | #travel | True | True | True | "I like your stuff" | False |
| YYYY-MM-DD HH:MI:SS | travelfeels | BfbZ6mOBKrx | @samkolder | False | True | True | "Awesome" | False |
| YYYY-MM-DD HH:MI:SS | samkolder | AfbZ6mOBKrz | #lifestyle | False | False | False | "" | True |

Logging should contain as little logic as possible. In the "Already liked" case I would just log that fact; in the analytics step the missing data can be filled in by querying for PostID="x" AND AlreadyLiked="False".
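As a rough sketch of what such a logic-free log could look like, the snippet below stores the proposed columns in SQLite and looks up the original interaction for a post that was later logged as already liked. The table name `interactions`, the helper names `log_interaction` and `original_interaction`, and the use of SQLite are assumptions for illustration only; nothing here is existing InstaPy code.

```python
import sqlite3

# Hypothetical table mirroring the proposed columns. "Like" is quoted because
# LIKE is an SQL keyword. Booleans are stored as 0/1 integers.
SCHEMA = """
CREATE TABLE IF NOT EXISTS interactions (
    Time         TEXT    NOT NULL,
    ProspectID   TEXT    NOT NULL,
    PostID       TEXT    NOT NULL,
    Source       TEXT    NOT NULL,
    Follow       INTEGER NOT NULL,
    "Like"       INTEGER NOT NULL,
    Comment      INTEGER NOT NULL,
    CommentText  TEXT    NOT NULL,
    AlreadyLiked INTEGER NOT NULL
)
"""


def log_interaction(conn, time, prospect_id, post_id, source,
                    follow, like, comment, comment_text, already_liked):
    """Append one row; no lookups or decisions happen at logging time."""
    conn.execute(
        "INSERT INTO interactions VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (time, prospect_id, post_id, source,
         int(follow), int(like), int(comment), comment_text,
         int(already_liked)),
    )


def original_interaction(conn, post_id):
    """Analytics step: find the row where the post was actually acted on."""
    return conn.execute(
        'SELECT Time, Source, Follow, "Like", Comment, CommentText '
        "FROM interactions WHERE PostID = ? AND AlreadyLiked = 0",
        (post_id,),
    ).fetchone()
```

Keeping the lookup in a separate analytics function means the bot itself only ever appends rows, which matches the idea of keeping logic out of the logging path.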

Now all that remains is to periodically scrape one's own followers into another database, and the data-science magic can begin. This is a research project after all ;)

Outlook: With this data at hand we can reduce unnecessary calls to Instagram. We can also figure out which hashtags help us grow our channels and which are ballast.

Before I start working on the implementation, I would like feedback on the suggested table from @timgrossmann, @converge and @sionking.

sionking commented 6 years ago

GREAT GREAT initiative! (like you read my mind, just added today to my todo list!) I can put a bounty if you like :) I will read it all later, and comment with details.

SKC

sionking commented 6 years ago

Please add time and date to the table. Will comment more tomorrow.

CharlesCCC commented 6 years ago

@dkbast Great idea. One more thing I would suggest adding to the table is a FollowedBack flag.

dkbast commented 6 years ago

@CharlesCCC Please clarify the use of the flag. How would you use this to measure conversion? Is it me who follows back? I don't see the use for that, but I'm open to suggestions. IMHO, logging whether the user followed back should be done in a separate table; or, to be more precise, it is the result of comparing the logging table with the current followers. This is consistent with the "no-logic-in-logging" paradigm.

CharlesCCC commented 6 years ago

@dkbast As your first bullet said, this is about measuring conversion rate. As I understand it, the conversion rate is measured in followers, not likes, right? That is the ultimate goal: getting someone to follow you. So if you have the FollowedBack flag set, you can use it to analyze, by tag and by the date/time of the follow, how to optimize your configuration.

sionking commented 6 years ago

I think @CharlesCCC means that when you follow someone and they follow you too, the FollowedBack flag should be raised. But how can we measure this? We would either run over all users we interacted with and check whether they are following us, or run over our followed-by users and match them against the table to raise the flag.

Can be done nicely with sets.
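A minimal sketch of the set-based comparison, assuming the logged prospects and a scraped follower list are already available as plain Python collections. The function names are hypothetical, and the conversion rate here counts unique prospects per source rather than posts, which is a simplification of the ratio in the original proposal. It also does not check whether someone was already a follower before the interaction.

```python
from collections import defaultdict


def followed_back(followed_by_us, current_followers):
    """Users we followed who now also follow us (plain set intersection)."""
    return set(followed_by_us) & set(current_followers)


def conversion_by_source(log_rows, current_followers):
    """Share of prospects per source that are now followers.

    log_rows is an iterable of (prospect_id, source) pairs taken from the
    interaction log; current_followers is the latest follower snapshot.
    """
    prospects = defaultdict(set)
    for prospect_id, source in log_rows:
        prospects[source].add(prospect_id)

    followers = set(current_followers)
    return {
        source: len(users & followers) / len(users)
        for source, users in prospects.items()
    }
```

Because the comparison happens after the fact, the logging table itself needs no FollowedBack column; the flag is derived whenever a new follower snapshot is taken.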

sionking commented 6 years ago

Btw, you can use the "black list" method; it has 70% of what you want to implement.

CharlesCCC commented 6 years ago

@sionking You are correct, that is exactly what I meant. As you may have noticed, there are two ways to determine whether a user has followed back: 1. we get a notification on the web page saying who is following us; 2. when someone is following you (and you haven't followed them), the "Follow" button becomes "Follow Back".

keusta commented 6 years ago

Great idea, I would add this to the table.

sionking commented 6 years ago

BTW you can use matplotlib for charts

imjustin commented 6 years ago

I came across this while looking up different logging approaches; I had some thoughts of my own on potential analytics. This is definitely an interesting approach, but if you're interacting with hundreds of users per day, couldn't this get unwieldy very quickly?

I was thinking of something like logging the number and type of actions every time the script is run, and then doing some kind of regression analysis to identify which action has the highest yield in terms of growth (the growth metric would be followers).
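To make the per-run idea concrete, here is a deliberately simplified sketch: each run is summarised as a dict of action counts plus the follower gain, and a one-variable least-squares slope is fitted for each action type. The dict keys and function names are hypothetical. Fitting each action separately ignores that action counts are usually correlated across runs, so a proper multiple regression would be needed before drawing real conclusions.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys on xs (followers gained per extra action)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("action count never varies; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def yield_per_action(runs, growth_key="new_followers"):
    """Fit one slope per action type from a list of per-run summaries.

    Each run is a dict such as {"likes": 300, "follows": 20,
    "new_followers": 7}; every key except growth_key is treated as an action.
    """
    growth = [run[growth_key] for run in runs]
    actions = [key for key in runs[0] if key != growth_key]
    return {
        action: ols_slope([run[action] for run in runs], growth)
        for action in actions
    }
```

This keeps the log to one small row per run, at the cost of losing the per-tag and per-post detail of the table proposed above.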

leoch20 commented 5 years ago

I have been thinking about how to implement this functionality. It would be so useful.

I think this can be achieved by calling table_logger https://pypi.org/project/table-logger/ when self.logger.info is called. From what I've seen, it would require parsing text out of the logs in order to make it table-friendly.

I have a few noob questions (Python/InstaPy newbie).

  1. Is there a way of calling table_logger once, let's say at # initialize and setup logging system for the InstaPy object?

  2. What is the most efficient way of parsing the log text to get only the wanted info? E.g. -> INFO [2018-10-21 22:21:55] [account] --> Followed 'b'someusername''! would retrieve '1' for column 'Followed' and 'someusername' for column 'Username'.

I really want to get this working. If anyone has any tips about what the best possible approach is, please let me know!
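For the second question, one possible approach is a single regular expression with named groups. The sketch below is written against the one example line quoted above and assumes every "Followed" message has that exact shape; the pattern and the function name `parse_follow_line` are hypothetical, and other log messages would each need their own pattern.

```python
import re

# Matches e.g.:
#   INFO [2018-10-21 22:21:55] [account] --> Followed 'b'someusername''!
# The optional b' prefix and the repeated closing quotes cover the bytes
# repr that appears in the example line.
FOLLOW_RE = re.compile(
    r"\[(?P<time>[^\]]+)\]\s+\[(?P<account>[^\]]+)\]\s+"
    r"--> Followed '(?:b')?(?P<username>[^']+)'*!"
)


def parse_follow_line(line):
    """Return a table-friendly dict for a 'Followed' line, else None."""
    match = FOLLOW_RE.search(line)
    if match is None:
        return None
    return {
        "Time": match.group("time"),
        "Username": match.group("username"),
        "Followed": 1,
    }
```

Parsing free-text messages is fragile if their wording changes, so writing the structured row at the point where the action happens may be the more robust option in the long run.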