Open avinashkranjan opened 3 years ago
Can I work on this issue under GSSOC21? Kindly assign it to me.
I have a suggestion. We can start by categorizing each script into a dictionary, something like:
{"Scrappers": ["Project1", "Project2"], "Automation":["Project3", "Project4"]}
Each list can also contain records of scripts that appear in another category. This will ensure efficient sorting and tag searching, and put each script in its proper place, keeping things organized. This dictionary can be saved in a CSV file for future additions/edits.
We can also make a function that acts as an API, adding a new script with the suitable tags, and helps contributors automate the process of registering their scripts instead of adding them manually. (Let's leave that for after we settle on how we will save the data of the existing scripts.)
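A minimal sketch of this idea, assuming illustrative function names and the category layout shown above (nothing here is a final design):

```python
import csv
from collections import defaultdict

# Category -> list of script names, as in the dictionary above.
categories = defaultdict(list)

def add_script(category, name):
    """Register a script under a category (duplicates are skipped)."""
    if name not in categories[category]:
        categories[category].append(name)

def save_csv(path):
    """Persist the dictionary: one row per category, its scripts after it."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for category, names in categories.items():
            writer.writerow([category] + names)

add_script("Scrappers", "Project1")
add_script("Automation", "Project3")
```

The same dictionary could later be dumped as JSON instead of CSV without changing the in-memory shape.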
Hey..! What if we create a script that takes folder names as arguments and lets the user decide which folder their script is to be moved into, with an option to add a new category as well.
script.py folder_name category
or
script.py -n new_category folder_name [folder_name [folder_name]]
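The invocation above could be sketched with argparse; the flag and argument names here are assumptions, not a settled interface:

```python
import argparse

# Sketch of the CLI described above; names like --new-category are assumptions.
parser = argparse.ArgumentParser(
    description="Move script folders into a category")
parser.add_argument("-n", "--new-category",
                    help="create this category before moving the folders")
parser.add_argument("folders", nargs="+", metavar="folder_name",
                    help="one or more script folders to move")

# Example: script.py -n new_category folder_name [folder_name ...]
args = parser.parse_args(["-n", "Automation", "Project3", "Project4"])
```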
Hi, @Ajay-Singh-Rana @XZANATOL @AshuKV,
As I have just returned home, I need some time to plan out the overall structure of the script, how it will handle new scripts and how we can have a GUI for that.
I will update you on this day after tomorrow
Any updates..?
Any updates..?
Will update by EOD
Hey @kaustubhgupta! If you need any help or have any doubts, kindly text me on Discord. Participants are asking about this issue.
Hi, @AshuKV @XZANATOL @avinashkranjan @Ajay-Singh-Rana (Apologies for updating late, was busy with important work),
I liked the @XZANATOL implementation to create the database of scripts based on categories. I think a big JSON would do the work. Why JSON? Because it can be loaded into the dictionary in Python. Creating this JSON would be a manual task initially as we can't trace back to every merged pull request and check if it was a scrapper or ML script as neither the participants have filled the PR template properly nor there was an option to add a category. @Ajay-Singh-Rana Your idea has a problem that a new user would not be aware of the folder where the script lies.
Also, one major problem apart from these ideas is that all participants have different entry points for their script. Some of them have main.py, scrapper.py, or script-specific names.
Here is my workaround for this issue in 4 tasks (Please read till the end before assuming why I am asking these):
Creating the manual JSON: This JSON file, as suggested by @XZANATOL, will have the root element as the category and its value as another JSON object with the following information:
Creating the demo script for testing: After this database is prepared, we can start working on the implementation part. This demo script will present the user with the categories of scripts. Then, as per user input, a list of available scripts will be presented, and the chosen script will be run. The script can be run via the subprocess module. Why not os? Because with subprocess we can define the script entry point and arguments more easily. Refer to this article for implementation. As soon as the user chooses a script, our script should use the database to see if it has a requirements.txt. If yes, first run: pip install -r $requirements_path/requirements.txt
and then find the entry point of the script to call: python entrypoint.py
This command can be run via subprocess too, with arguments.
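The install-then-run flow described here could look roughly like this; the function name and parameters are illustrative, not the actual implementation:

```python
import subprocess
import sys
from pathlib import Path

def run_script(entry, requirements=None, args=()):
    """Install a script's requirements (if it has any), then run its
    entry point, both via subprocess as described above."""
    if requirements and Path(requirements).exists():
        subprocess.run(
            [sys.executable, "-m", "pip", "install", "-r", str(requirements)],
            check=True)
    return subprocess.run([sys.executable, str(entry), *args], check=True)
```

Using `sys.executable` instead of a hardcoded `python` keeps the child process on the same interpreter the menu script runs under.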
Porting this demo to GUI: If this demo script works, then we can port this to a suitable GUI with a menu to select the category, sub-script, and provide arguments if needed. We can also have a checkbox for installing requirements.txt! The GUI can be made in any framework depending upon which contributor takes this sub-issue.
Automating JSON updation: The manual information adding is a one-time task. To add new script information to this JSON, we will use the PR template and GitHub Actions. The PR template needs to be updated with the new parameters mentioned in the manual JSON work, and every participant will have to compulsorily fill in this information. The GitHub Action, on every PR merged, will push this information to the JSON, and hence this repo will be turned into an application! While this task is being performed, all other PRs will be halted for a while, and contributors will need to update the PR template accordingly.
BONUS task, website updation: The website currently has hardcoded information about the scripts. The generated JSON can be used by web designers to display script information with the contributor name, a small description, and a link to the script. That's why I asked for this information in the PR template in the first place! This will help both projects!
As these tasks are themselves quite big, every task will be rewarded level3. Also, no separate issues will be created for this; all 4-5 PRs will be merged tagging this issue only. The subtasks will be assigned and ticked off in the order mentioned in this comment.
Let me know what should be modified, any new ideas except this, or anything else. Also, which task you are interested in.
VERY IMPORTANT INFO: THIS WHOLE ISSUE IS NOT FIRST COME FIRST SERVE. COMMENT AND DISCUSS ONLY IF YOU CAN GENUINELY HELP AND HAVE THE SKILLS TO CODE THIS.
I have created a separate discussion thread for this. Join here: https://github.com/avinashkranjan/Amazing-Python-Scripts/discussions/882
Hey... Why not make things easy by moving scripts into specific task-category folders in the repo itself? This would also make the repository much more accessible based on specificity.
@Ajay-Singh-Rana many scripts can belong to several different categories, not just one, so doing this would make it even harder to keep track of the scripts when we try to display them by specificity 🤔
We are not aiming to restructure the repo but create a master script to initiate any script of this repo
Finally got free from my commitments.. xd @kaustubhgupta I would like to start with the base of the project by creating the JSON file. Gonna read each project of the repo and collect the necessary information. I can also add a code snippet of the function that allows a developer to add his/her script to the repo, as I will automate this part on my local machine. xd
@XZANATOL Cool, you can start working on this sub-task.
@kaustubhgupta @XZANATOL do you guys mean that the repository would remain as it is, the JSON has to be created manually at first, and later on we can give an option to add a script under a particular section by passing arguments? Is that it?
@kaustubhgupta Hi there, I've finished most of the JSON file; here is a small part of how it is structured.
The JSON file follows this structure:
{category: {name: [path, entry, arguments, requirments_path, contributor, description]}}
Tell me what you think about it, and whether I should change anything.
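For illustration, a hypothetical entry following that structure might look like this (the project name, paths, and description here are made up, not taken from the real database):

```python
import json

# Hypothetical entry following the proposed layout; the project name,
# paths, and description are made up, not from the real database.
sample = json.loads("""
{
  "Scrappers": {
    "Example-Scraper": [
      "Scrappers/Example-Scraper",
      "main.py",
      "argv",
      "Scrappers/Example-Scraper/requirements.txt",
      "some-contributor",
      "Example description of what the script does"
    ]
  }
}
""")

path, entry, arguments, requirments_path, contributor, description = \
    sample["Scrappers"]["Example-Scraper"]
```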
Perfect, 💯
I am working on the Pull request automation locally and maybe I will create the pull request for that. Let's see how this works and then we can move towards making the demo script!
Yes @Ajay-Singh-Rana. If you want, you can work on the pull request automation.
@kaustubhgupta I would like to work on the 2nd task, Creating the demo script for testing. Just a small question regarding how to deal with scripts that require arguments: should the user first be prompted about which argument is required, and then should we take the argument from the user as a command-line argument?
Hi @Ayushjain2205, let @XZANATOL update the database first, and then I can explain the actual implementation to you better.
@kaustubhgupta I have a doubt. There are some scripts that need a requirements.txt file, but it does not exist; however, the pip commands are present in the Readme.md file. Shall I make a separate requirements.txt file and add what's needed for each script (if necessary), or shall I give the path to the Readme.md file instead? Though the second option will prevent adding a step to prepare the script's libraries using our menu-driven script. :thinking:
Hi. I created a menu-driven script for another repo containing scripts related to penetration testing and hacking. I would also like to work on this one.
I'll set up a draft PR in some time and will continue pushing changes into it.
@devRawnie don't make the draft PR now. Read this comment https://github.com/avinashkranjan/Amazing-Python-Scripts/issues/831#issuecomment-817241049 and let me know in which part you're interested.
No, don't make requirements for those scripts. I will create another issue for fixing these. Just include the requirements that have already been pushed by contributors.
Hi @kaustubhgupta. I would like to work on the second task. The code that I wrote earlier for the menu-driven program loosely followed the methodology you mentioned here. So I would like to work on the demo script part, where we take input from the user based on the arguments required. After that, I would also like to learn about GitHub Actions and automate the JSON updation.
Okay, for now, wait for the JSON file to be updated.
@XZANATOL updates on this?
Already finished 3 categories, about 4 left. By the way, where should I place the JSON file? In the root directory of the repo, or shall I make a "lazy script" folder?
Umm, yes create a folder for this, "Master Script"
@kaustubhgupta Hi there, sorry for the delay. There is a hidden mess behind each folder in the repo; it gave me a headache trying to filter each one. xd
The database is ready and I'm ready to make a PR (I did a git pull a couple of minutes ago to make sure everything is up to date), but one last doubt before submitting everything: there are 6 projects which I don't know how to implement (I'll mention them by their folder names):
1) AWS Management Scripts: this one has multiple scripts, each with its own job. Any ideas how it should be implemented? I can fill in what's needed and, for the python-path, just put a "multiple" value to indicate that there are multiple scripts there.
2) Spelling-Checker, Data-Visualization: same thing here.
3) Restoring-Divider: I don't know what this one is here for. I even read the code but couldn't really understand its purpose. xd There is no Readme.md file.
4) Convert2jpg, Internet-Speed-Test: finally, these 2 are just Readme.md files which tell how to install an external library and use it. (No scripts exist there.)
Note for future contributions: there are .ipynb files beside .py files in the DB, so consider checking the extension first to decide whether the Master-Script will run the project using the Python shell or a notebook.
What's now ready:
1) The JSON DB file. Fun fact: the file is 50K characters long. xD
2) A Python script that has the code to add & read projects in the DB and give a report on any unadded projects. I can exclude that from the submission, but it will help a lot with upcoming contributions.
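The extension check mentioned in the note above could be sketched like this; the nbconvert invocation is one possible way to run notebooks, not an agreed design:

```python
from pathlib import Path

def run_command_for(entry):
    """Choose a launcher by extension: notebooks via nbconvert (one
    possible approach, not an agreed design), plain scripts via python."""
    if Path(entry).suffix == ".ipynb":
        return ["jupyter", "nbconvert", "--to", "notebook", "--execute", entry]
    return ["python", entry]
```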
Good to see the progress till now 😄!
You can skip these projects for now (just mention them in the PR description so a separate issue can be created for them). Include the Python script too; if your script works well, then I will guide you about the GitHub Action, and if everything works well, I will assign both tasks (the automation PR too) to you (we will make a separate PR for this; can be discussed later).
@XZANATOL#1411, I might be busy for some days (due to overloaded work), so I would suggest starting the modification of the script you pushed in the above PR. For using the GitHub API, you can refer to this project file of mine.
Secondly, for accessing the PR information, refer to this object in the PyGithub library: https://pygithub.readthedocs.io/en/latest/github_objects/PullRequest.html. The naive way of doing this is using the GitHub API directly. For example, see this: https://api.github.com/repos/kaustubhgupta/PortfolioFy/pulls/4
Here you will find the "body" key, which contains our metadata to be put into the JSON.
One challenge I am facing here is that I am not sure how we will get the PR number, as I am not able to find any object for that in the GitHub Actions documentation. I am still posting the link here: https://docs.github.com/en/actions/reference/context-and-expression-syntax-for-github-actions
Let me know if you find the solution for this last thing.
For any doubts about this issue, feel free to discuss them here only. (I just saw that you messaged me on Discord, but I don't check it regularly.)
@devRawnie as the database is now pushed, are you interested in creating the demo script?
Yeah sure. Any pre-requisite that I need to take care of?
@devRawnie as of now, I think as you have already made a menu-driven script earlier, you are already aware of the workflow. Create a draft PR after you're done with an initial model and we will see if that matches our requirements. You can refer to this comment again for any confusion: https://github.com/avinashkranjan/Amazing-Python-Scripts/issues/831#issuecomment-817241049
Sure. I am on it
@kaustubhgupta, I was thinking about another approach for this. PRs have some defined environment variables which we can make use of. Some of them are:
- github.event.pull_request.number gets the PR number.
- github.event.pull_request.body gets the body content of the PR.
So, I don't think we need to establish any kind of external connection, since what we want already exists in the environment variables. We can test passing them to the script using sys.argv or using Docker, extract what's needed from the body using regex, then edit the JSON file easily. (I don't know if I'll have to make any tweaks to this part if we go with the Docker solution; need to test first.)
Tell me your thoughts. (And all good regarding the Discord thing :joy: I'm mostly active there, so you can message me anytime.)
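The regex-extraction step could be sketched roughly like this, assuming a hypothetical "Field: value" PR-template layout (the real template keys were still being decided at this point in the discussion):

```python
import re

# Hypothetical "Field: value" PR-template body; the real keys were
# still under discussion when this was written.
PR_BODY = """
Category: Automation
Script Name: Demo-Script
Entry Point: main.py
"""

def extract(field, body):
    """Pull the value of a 'Field: value' line out of the PR body."""
    match = re.search(rf"^{re.escape(field)}:\s*(.+)$", body, re.MULTILINE)
    return match.group(1).strip() if match else None
```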
That's perfect, this is what I was expecting. Modify the script for the same and let me know the desired template for the PR, that is, what type of format we should adopt so that the regex part is easy.
Also, as PR merging is now halted, I want you to verify that the database is up to date at this point, with no project left out up to your database PR. If any are not covered, add them to the database.
ok, I will begin working on it.
@kaustubhgupta So I was thinking of creating a main.py file in the root directory of the repository. Running main.py would show the options for running the different scripts, using the datastore.json file as a reference.
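A minimal sketch of the menu-building part of such a main.py, with illustrative helper names (the interactive input loop is omitted):

```python
import json  # in practice the datastore would come from json.load("datastore.json")

def build_menu(datastore):
    """Number the categories so the user can pick one (helper names
    here are illustrative, not from an actual PR)."""
    return {i + 1: c for i, c in enumerate(sorted(datastore))}

def scripts_for(datastore, category):
    """List the scripts available under the chosen category."""
    return sorted(datastore[category])
```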
@XZANATOL @kaustubhgupta The issue with the current format of datastore.json is that it only has the value "argv" in some cases. There is no way to determine the name and purpose of the argument passed, so we would need a format where we have a dictionary object for arguments containing their name, purpose, and type (positional, optional), etc. I would like to create a separate PR for editing the JSON file. For now, there are fewer than 10 scripts which take command-line arguments, so it is still manageable to update the datastore file.
@kaustubhgupta, how should we take care of the requirements of different scripts? Should I run a pip install command on the requirements.txt file of a particular script every time before running that script?
@devRawnie umm, actually it doesn't; if you look at the script, there is a structure that the database follows xd.
For every project it is:
{Name: [path, entry, arguments, requirments_path, contributor, description], Name2: ...}
And for the requirements of each script 🤔 how about exporting the script's record into a temp variable, then asking the user what to do with it: either prepare it using the requirements.txt or run it? This will help avoid unnecessary command runs each time a script is run.
Also, many scripts (I mean most of the repo) don't use arguments. They either get their inputs using the input method, or from a file.
@kaustubhgupta I have a small doubt. Many contributors have their own style of writing, and many make typos. What I am pointing out here is: suppose I made a custom PR template and provided the available categories to put their script in. If they make a typo (letter case, or a missing letter), this will cause the creation of a totally new category, and this happening a couple of times will literally destroy the menu-driven project. xD
Here is a solution: we make the GitHub Action trigger on a PR comment and check whether the comment body begins with a prefix, let's say (add_script); then the contributor manually adds the project details. I know it's manual work and we want it automated, but otherwise we are taking the risk of destroying that 51K-character database. XD
Let me know what you think about it.
@XZANATOL I did see the format of the datastore.json file, and only some 10-11 scripts require command-line arguments. What I am having a problem with is how to determine the name of an argument and the number of arguments, because datastore.json stores either "argv" or the '-'-separated names of the arguments. But it does not give any information about: 1. whether the argument is optional or required; 2. whether the argument is to be passed with its name or without it.
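One possible richer per-argument record addressing these two points; the field names here are a proposal, not an agreed schema:

```python
# A possible richer per-argument record; the field names are a
# proposal, not an agreed schema.
arguments = [
    {"name": "--url", "required": True, "positional": False,
     "purpose": "page to scrape"},
    {"name": "output_file", "required": False, "positional": True,
     "purpose": "where to save results"},
]

def required_flags(spec):
    """Names of the non-positional arguments the user must supply."""
    return [a["name"] for a in spec if a["required"] and not a["positional"]]
```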
@XZANATOL Additionally, I did not understand what you meant by "either prepare using requirements.txt or run it", because in either case we would have to install the packages required for that particular script?
@XZANATOL , @kaustubhgupta I made the format of each script like this earlier. Can we merge the argument section with the structure that @XZANATOL made?
Let me clarify the doubts one by one
Hmmm, okay, let's begin with the '-'-separated arguments. You can let the user print the help menu first, like python3 script.py --help, to know what the required and optional arguments are. Then you implement 2 things:
1) a basic run of the script, like passing only one argument;
2) allowing the user to write a custom command to run the script. You get a string input from the user and pass it to the os.system method.
For the argv ones... Hmm... I think you will need to read the script details first, and then do the same thing as above.
For the requirements topic: what I meant is to use something like a select method, where you export the project from the database to a variable and then go to another menu, where the user has to choose between something like: 1) run the script, 2) run a custom command, 3) print the help menu, 4) required arguments, etc.
[UPDATE]: Check this comment for progress on this issue: https://github.com/avinashkranjan/Amazing-Python-Scripts/issues/831#issuecomment-817241049
Is your feature request related to a problem? Please describe.
Changing directories looking for scripts to use, one by one, will be quite a headache.
Describe the solution you'd like
Making it a menu-driven script, where a user can select an option and the specific script will run (taking arguments too, if needed at runtime), sort of like Lazy Script, if you are aware of it.
Let me brief you about what you have to do on this issue:
i.e. /root/your_filename while working on GitHub
Marking Guidelines
This issue is Level3; each linked PR would be of minimum Level2.
@kaustubhgupta @antrikshmisri @santushtisharma10 Kindly help out the contributors working on this particular issue, as it is a complex one. If there are any issues, DM me on Discord.
Happy Coding..👨🏻💻