chavoosh / ndn-mongo-fileserver

An NDN fileserver based on MongoDB
GNU General Public License v3.0

Current version for gdrive fetcher #14

Open wenkaizheng opened 4 years ago

wenkaizheng commented 4 years ago

Some prerequisites for running my script: please follow the link below step by step. https://developers.google.com/drive/api/v3/quickstart/python In step 1, you will get a JSON file containing the developer account configuration, and my script needs this JSON file to run. I recommend not using your personal account if you want to publish this JSON file on GitHub, because other people could develop apps as the account holder once they have it. Maybe we can register a dummy Google account that does not include any personal information. Name the JSON file "credapi.json" and put it in the cms folder. Let me know if you have any problems.

wenkaizheng commented 4 years ago

If one video fails during the encoding process, do we just ignore this file and log all the error messages?

chavoosh commented 4 years ago

General comments: There are no instructions for installing, configuring, and running the application. For example, I see you are using several non-built-in Python packages (e.g., dateutil), but you did not provide any instructions to install them first (look at https://github.com/chavoosh/ndn-mongo-fileserver#prerequisites and how all prerequisite commands are listed).

I do not see any documentation about the features and tasks (either done or ongoing) in the pull request's description.

Please make a separate pull request for unit-tests.

This commit includes 500+ lines of code (much more than I expected). Usually, reviewing a lengthy commit is not a good idea, but I hope I can finish it ASAP without missing any points.

chavoosh commented 4 years ago

Please apply the first comment, ASAP.

wenkaizheng commented 4 years ago

This script goes through all files in a specific Google Drive account and connects with ndn-mongo-fileserver. If there is any change in Google Drive (e.g., update, delete, insert), the script will handle it.

Insert: When a user inserts a new file into their folder, the script creates a user folder and downloads the file to disk. It then runs the scripts in the video folder to encode, package, and chunk the file, and eventually generates an HTML file.

Delete: When a user deletes a file in their folder, the script deletes all corresponding files on disk, such as the encoded files, package folder, and HTML file.

Update: When a user changes the name of a file, the script renames all corresponding files on disk, such as the encoded files, package folder, and HTML file.
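As a rough illustration of the insert/delete/update handling described above, the sketch below compares two snapshots of a Drive listing and classifies the differences. How the script actually detects changes is not shown in this thread, so the snapshot format (`{file_id: file_name}`), the function name `diff_snapshots`, and the sample IDs are all assumptions made for the example.

```python
def diff_snapshots(old, new):
    """Classify changes between two {file_id: file_name} snapshots.

    Hypothetical helper: the snapshot format is an assumption, not the
    script's real data model.
    """
    # Present now but not before: newly inserted files.
    inserted = {fid: name for fid, name in new.items() if fid not in old}
    # Present before but not now: deleted files.
    deleted = {fid: name for fid, name in old.items() if fid not in new}
    # Same ID, different name: renamed files, stored as (old_name, new_name).
    renamed = {
        fid: (old[fid], new[fid])
        for fid in old.keys() & new.keys()
        if old[fid] != new[fid]
    }
    return inserted, deleted, renamed


# Made-up sample data for illustration only.
old = {"id1": "lecture.mp4", "id2": "demo.mp4"}
new = {"id1": "lecture-week1.mp4", "id3": "intro.mp4"}

inserted, deleted, renamed = diff_snapshots(old, new)
print("insert:", inserted)
print("delete:", deleted)
print("update:", renamed)
```

Keying on the file ID rather than the name is what lets a rename be told apart from a delete followed by an insert.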

This script also provides two modes: info mode and debug mode. Either one generates file_record.txt as a log file for checking the status of each file, e.g., initial, download, encoded, packaged, chunked, html and js first, html and js second, deleted. These statuses let developers follow the working process, and if the script crashes at any time for any reason, you can check which step was completed last. The difference between info mode and debug mode is that debug mode also includes the output of the video scripts. The script also generates the binary file data-s, which records the status of each file as well, but can load and dump any data structure. More specifically, data-s holds serialized data that lets the script check the status of each file very quickly; once the script knows a file's status, it can resume from the last completed step even if it crashed in the middle.
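The resume-after-crash idea can be sketched roughly as follows. This is not the script's actual code: using `pickle` for data-s is an assumption (the thread only says the file is binary and can load and dump any data structure), the stage order is inferred from the status list above with the html/js and deleted statuses left out for brevity, and the helper names are hypothetical.

```python
import os
import pickle
import tempfile

# Stage names come from the status list above; their order is an assumption.
STAGES = ["initial", "download", "encoded", "packaged", "chunked"]


def load_status(path):
    """Return the saved {file_id: last_completed_stage} map, or {} on first run."""
    if not os.path.exists(path):
        return {}
    with open(path, "rb") as fh:
        return pickle.load(fh)


def save_status(path, status):
    """Write to a temporary file and swap it in, so a crash mid-write
    cannot leave a half-written status file behind."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as fh:
        pickle.dump(status, fh)
    os.replace(tmp, path)


def remaining_stages(status, file_id):
    """Stages still to run for a file, given its last completed stage."""
    done = status.get(file_id, "initial")
    return STAGES[STAGES.index(done) + 1:]


# Demo in a throwaway directory; "id1" is a made-up file ID.
path = os.path.join(tempfile.mkdtemp(), "data-s")
status = load_status(path)
status["id1"] = "encoded"
save_status(path, status)

print(remaining_stages(load_status(path), "id1"))
```

Saving the status after every completed stage is what makes the resume point precise; saving only at the end of a file would force the whole pipeline to rerun after a crash.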

Note: Checking the chunking process against MongoDB is not done yet. I thought we had talked before about some functions for deleting and renaming files in MongoDB. Also, if a file fails at any step in the middle, do we need to do anything other than record it in the log file?

Some prerequisites for this script:
1. Install gdrive from this link: https://github.com/prasmussen/gdrive
2. Install the Google authentication module and enable the Google Drive API from this link: https://developers.google.com/drive/api/v3/quickstart/python
3. After you finish step 1 of that quickstart, you will get a JSON file; please name it "credapi.json" and make sure it is in the cms folder.
4. Install the dateutil module: pip install python-dateutil
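A small preflight check along these lines could report missing packages before the script starts. The module list is an assumption assembled from this thread (dateutil) and the Google quickstart's client packages; it may not be complete for the actual script, and `missing_modules` is a hypothetical helper name.

```python
import importlib.util

# Import name -> pip package name. The list is an assumption based on the
# packages mentioned in this thread and in the Google Drive quickstart.
PIP_NAMES = {
    "dateutil": "python-dateutil",
    "googleapiclient": "google-api-python-client",
    "google_auth_httplib2": "google-auth-httplib2",
    "google_auth_oauthlib": "google-auth-oauthlib",
}


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec raises for dotted names with a missing parent package.
            found = False
        if not found:
            missing.append(name)
    return missing


for module in missing_modules(PIP_NAMES):
    print(f"missing {module}: pip install {PIP_NAMES[module]}")
```

This only checks that the modules are importable; it does not detect version mismatches between installed packages.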

chavoosh commented 4 years ago

This description is barely what I asked for. I referred you to https://github.com/chavoosh/ndn-mongo-fileserver/blob/master/README.md to learn what to put in the application's readme, but I do not see such a file. I expected to see a Readme.md file under the /cms directory, with well-formatted and easy-to-follow instructions.

python-dateutil is not the only non-built-in package you used in the application. I had to spend an hour fixing the dependencies of this application, which is not a good sign, as no one else will spend this time. Apparently, you did not document or track the packages that you installed while working on your script.

Just as an example, users have to install google_auth_oauthlib.flow, upgrade their setuptools, and resolve the version mismatch of urllib3 and other related packages (e.g., chardet) just to be able to run this script.

I want you to write complete, step-by-step, and easy-to-follow instructions in a Readme file.

Meanwhile, the information you provided in the description, although useful, is not organized; it does not have any structure. Please also include the tasks and features.

chavoosh commented 4 years ago

Another comment. I do not know how to check the application when it runs. For example, how can I specify the folder under which I am going to share videos? How can I pass the name of the directory to the application? Given there is a directory with some videos in it, what should I see on the application's side? Should I check the log files, MongoDB, or the actual kernel-level process?

wenkaizheng commented 4 years ago

Did you visit the link I provided about the Google Drive API? I think step 2 includes all the commands for the packages you need to install.

wenkaizheng commented 4 years ago

When you run this application, you will see a new folder created in the working directory, and the video file, encoded files, package folder, and HTML file should all be in there.

chavoosh commented 4 years ago

> Did you visit the link I provided about the Google Drive API? I think step 2 includes all the commands for the packages you need to install.

My bad. I missed that part for the google-api packages. Just a suggestion: in the Readme, when you refer readers to https://developers.google.com/drive/api/v3/quickstart/python, it is good to specify STEP 1 & STEP 2. In my case, I did not follow all the steps because I saw that most of them were about running a sample.

chavoosh commented 4 years ago

> When you run this application, you will see a new folder created in the working directory, and the video file, encoded files, package folder, and HTML file should all be in there.

By working directory, do you mean the directory I am running ndn_script.py in?

wenkaizheng commented 4 years ago

> > When you run this application, you will see a new folder created in the working directory, and the video file, encoded files, package folder, and HTML file should all be in there.

> By working directory, do you mean the directory I am running ndn_script.py in?

yes

chavoosh commented 4 years ago

Still, I do not know which folder the script checks for the videos. I mean in Google Drive.

wenkaizheng commented 4 years ago

> Still, I do not know which folder the script checks for the videos.

I think the script will check all folders in Google Drive, and if any folder has changed, the change will be reflected in the current working directory.

chavoosh commented 4 years ago

> I think the script will check all folders in Google Drive, and if any folder has changed, the change will be reflected in the current working directory.

Well, this is not a good design. There should be a way to pass the name of a directory (or directories) to the script so that it only checks those folders. Can you update the code?

Sure. Does the directory name mean the name in Google Drive? Also, what if a developer does not give any directory name? Should the script check all folders?

chavoosh commented 4 years ago

I guess you edited my message!!

wenkaizheng commented 4 years ago

> I guess you edited my message!!

I think I only edited the part where I replied.

chavoosh commented 4 years ago

> > I guess you edited my message!!

> I think I only edited the part where I replied.

Yes. I meant you can quote what I said if you need to reply. This makes the thread clearer for future reference.

wenkaizheng commented 4 years ago

> > > I guess you edited my message!!

> > I think I only edited the part where I replied.

> Yes. I meant you can quote what I said if you need to reply. This makes the thread clearer for future reference.

Sure

wenkaizheng commented 4 years ago

> > I think the script will check all folders in Google Drive, and if any folder has changed, the change will be reflected in the current working directory.

> Well, this is not a good design. There should be a way to pass the name of a directory (or directories) to the script so that it only checks those folders. Can you update the code?

> Sure. Does the directory name mean the name in Google Drive? Also, what if a developer does not give any directory name? Should the script check all folders?

Can you please answer those questions?

chavoosh commented 4 years ago

> Sure. Does the directory name mean the name in Google Drive?

Yes. Let's say the folder I want to put my videos in is called ivisa_videos; then there should be a way to tell the application that it should only check that directory.

> Also, what if a developer does not give any directory name? Should the script check all folders?

The script should prompt for the name of the directory that it is supposed to scan. You can also add an argument to pass it on the command line, something like: $ ndn_script.py -d ivisa_videos

If the directory is not passed to the application, the app MUST ask for it on the command line or refuse to run.
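One way to get this behaviour is sketched below with `argparse`. Only the `-d` flag comes from this thread; the long option `--directory`, the function name `resolve_directory`, and the prompt text are assumptions for illustration.

```python
import argparse


def resolve_directory(argv, prompt=input):
    """Return the Drive folder name from -d, or ask for it; exit if none is given.

    `prompt` is injectable so the function can be exercised without a terminal.
    """
    parser = argparse.ArgumentParser(prog="ndn_script.py")
    parser.add_argument(
        "-d", "--directory",  # the long form is an assumption
        help="name of the Google Drive folder to scan",
    )
    args = parser.parse_args(argv)

    directory = args.directory
    if not directory:
        # Not passed on the command line: ask for it interactively.
        directory = prompt("Google Drive directory to scan: ").strip()
    if not directory:
        # Still nothing: refuse to run (prints usage and exits with status 2).
        parser.error("a directory name is required")
    return directory


print(resolve_directory(["-d", "ivisa_videos"]))
```

In the script itself this would be called as `resolve_directory(sys.argv[1:])`, so that running without `-d` falls back to the prompt.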

In a real scenario, I will create a directory and put the directories/folders of all users under it.
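Restricting the scan to one parent folder with per-user folders under it could be expressed with Drive API v3 search queries, roughly as sketched here. The helper names are hypothetical, and the sketch only builds the query strings; running them needs an authenticated Drive service object, which is outside this example.

```python
FOLDER_MIME = "application/vnd.google-apps.folder"


def escape(value):
    """Escape backslashes and single quotes for a Drive search query."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def folder_by_name_query(name):
    """Query that matches non-trashed folders with the given name."""
    return (
        f"name = '{escape(name)}' and mimeType = '{FOLDER_MIME}' "
        "and trashed = false"
    )


def children_query(folder_id):
    """Query that matches non-trashed items directly under the given folder."""
    return f"'{escape(folder_id)}' in parents and trashed = false"


print(folder_by_name_query("ivisa_videos"))
print(children_query("abc123"))  # "abc123" is a made-up folder ID

# With an authenticated service object (an assumption, not shown here),
# the queries would be passed along these lines:
#   service.files().list(q=children_query(folder_id),
#                        fields="files(id, name)").execute()
```

Folder names are not unique in Google Drive, so a lookup by name can return more than one folder; the script would have to decide how to handle that case.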

wenkaizheng commented 4 years ago

> > Sure. Does the directory name mean the name in Google Drive?

> Yes. Let's say the folder I want to put my videos in is called ivisa_videos; then there should be a way to tell the application that it should only check that directory.

> > Also, what if a developer does not give any directory name? Should the script check all folders?

> The script should prompt for the name of the directory that it is supposed to scan. You can also add an argument to pass it on the command line, something like: $ ndn_script.py -d ivisa_videos

> If the directory is not passed to the application, the app MUST ask for it on the command line or refuse to run.

> In a real scenario, I will create a directory and put the directories/folders of all users under it.

So this script just monitors one folder in Google Drive at a time? Can the input be a list of folder names instead?

chavoosh commented 4 years ago

> So this script just monitors one folder in Google Drive at a time? Can the input be a list of folder names instead?

No. Just one folder.

wenkaizheng commented 4 years ago

> > So this script just monitors one folder in Google Drive at a time? Can the input be a list of folder names instead?

> No. Just one folder.

Ok, I will start to update the code.