fklemme opened this issue 3 years ago
This is actually something I was investigating/starting to work on setting up myself, at least for #2 (or a variation of it, i.e. using nginx `autoindex_format json` output). I'll poke around at it later and maybe toss some thoughts here; putting this comment here to remind myself 😄
i agree & think that local file support is the future of this project - was planning on having a crack at it this week.
imo the way to do this while minimising structural changes to the project is to index local files within the current three elasticsearch indexes, which means modifying those ORM tables to support both local files and gdrive files. it might also make sense to have a user interface for linking folders to mpc autofill, plus a second database updater script which crawls local files and is accessible with a button press rather than requiring the command line. in that instance, we'd want a django setting for local mode vs hosted, and only enable local file support in local mode.
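to make that concrete, the ORM change could look something like this - a very rough sketch only, with made-up model and field names rather than the project's actual schema:

```python
# rough sketch: a source-type discriminator so one table can hold both
# gdrive and local images. all names here are illustrative.
from django.db import models


class Card(models.Model):
    class SourceType(models.TextChoices):
        GOOGLE_DRIVE = "gdrive"
        LOCAL_FILE = "local"

    source_type = models.CharField(
        max_length=16, choices=SourceType.choices, default=SourceType.GOOGLE_DRIVE
    )
    # holds a drive ID for gdrive cards, or a file path for local cards
    identifier = models.CharField(max_length=500)
    name = models.CharField(max_length=200)
```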
google drive is baked into mpc autofill fairly deeply so this'd involve a bit of frontend work to handle local files differently to gdrive images as well.
storing local file paths in xml and using them when autofilling should be fairly straightforward.
thoughts?
One challenge (or opportunity) that comes with local files is the Docker setup. All Docker applications run in their own virtual filesystem that is decoupled from the real filesystem the user sees. To make a local folder visible to django/nginx, the user has to mount it before starting Docker, looking maybe something like this: `-v "D:\My Proxies":/cards`. So, the local folder `D:\My Proxies` will be visible as `/cards` to django/nginx. (Of course, we can make a config file or something for the path.) The downside of this: the path will look different for django and for the user. The upside: the path will always be the same (`/cards/...`), no matter what directory the user mounts. Maybe we can leverage this!

The biggest challenge that comes along with this: if django only sees `/cards/otto/swamp.jpg`, what do we put in the orders.xml? A simple solution could be to also serve `/cards` with nginx; the path then becomes `http://localhost:8000/cards/otto/swamp.jpg` and the autofill client just downloads the card as usual. Maybe this is the most portable approach. This would even work in a hosted situation where the host wants to offer local files as well.
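To illustrate the path translation (just a sketch on my side; the host and port are assumptions):

```python
# Rough sketch: translate a path as django sees it inside the container
# (/cards/...) into a URL the autofill client can download from.
from pathlib import PurePosixPath

CARDS_ROOT = PurePosixPath("/cards")
BASE_URL = "http://localhost:8000/cards"


def card_url(container_path):
    # strip the mount point and re-root the path under the nginx URL
    relative = PurePosixPath(container_path).relative_to(CARDS_ROOT)
    return f"{BASE_URL}/{relative}"


print(card_url("/cards/otto/swamp.jpg"))
# -> http://localhost:8000/cards/otto/swamp.jpg
```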
In general, I would prefer a solution where the user is able to use Google Drive and local files at the same time, if they like. That wouldn't be a problem, would it?
Last but not least, it would be great if the Google Service Account (`client_secrets.json`) were only required if Google Drives are configured in `drives.csv`. I'm not too familiar with django and the code, so maybe this is already the case.
that's a very good point - i dig the idea of optionally pointing the docker image at a single directory to index (and we could set this up such that the docker config modifies a django setting, meaning the project is still usable without docker). it may also be possible to run the autofill script from within the docker image, eliminating the need for downloading the images from the docker image back to the host file system.
definitely planning on retaining google drive functionality alongside local file support!
re: `client_secrets` - the file is only required by the drive crawler management script (`update_database.py`). once drive files are indexed, the frontend retrieves thumbnail images and supports downloading the full res images without needing to authenticate with google.
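for what it's worth, guarding the auth step could be as simple as something like this (sketch only - the `read_drives` helper is made up, not the actual code):

```python
# sketch: skip google auth entirely when drives.csv is empty or missing.
import csv
import os


def read_drives(path="drives.csv"):
    # treat a missing drives.csv the same as an empty one
    if not os.path.exists(path):
        return []
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


drives = read_drives()
if drives:
    # only now is client_secrets.json actually needed
    if not os.path.exists("client_secrets.json"):
        raise SystemExit("client_secrets.json is required when drives.csv is non-empty")
```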
Having one folder configured in an env variable might be just it. Django can read that, and docker-compose can read that (to then mount it and re-route django to, e.g., `/cards`). Still, it would be good if we serve the images through staticfiles for portability, rather than letting the HTML point to files on the local filesystem. I'm not 100% sure how to do this best in a development setup, though.
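For example, something like this in the settings (just a sketch; the variable name is only a suggestion):

```python
# settings.py sketch: read the cards folder from an environment variable,
# so docker-compose and a bare-metal setup configure it the same way.
import os

# docker-compose would set this to the mount point, e.g. /cards;
# a non-docker user would point it at their folder directly.
LOCAL_FILE_INDEX = os.environ.get("LOCAL_FILE_INDEX", "/cards")
```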
Oh man, I never even considered putting the client into Docker as well. :heart_eyes: Got to look into that. The tricky part is that Docker doesn't offer a GUI natively. So maybe we need the user to enter MPC credentials, run Chrome headless, and store the uploaded project (as offered by MPC). Otherwise, the user might need something like a VNC client to display the Chrome instance running in Docker. But I will have a look at what options are available.
I am only a novice self-taught programmer, so I don't know that there's much I can offer to help, but if you guys could add local folder support, or even network drive support, that would be absolutely amazing. Now I just need to figure out how to get this running as a Docker container on my unRAID server haha.
(Also, the URLs shown in the xml file in the video are internal to my LAN only, so don't bother typing them out)
(edit, in case the URL above is broken because thanks GitHub: https://www.youtube.com/watch?v=piI_EMZVgZs)
in terms of the local tool, i think the way to go will be storing local files' paths in the `<id>` tag in xml - it'd be difficult to determine from the xml whether an image is from google drive or is locally stored without changing the xml schema. adding support for this in my local tool rewrite branch: https://github.com/chilli-axe/mpc-autofill/tree/local-tool-rewrite (the remote isn't up to date atm but i'll push my changes shortly). e.g. this is working for me:
```xml
<order>
    <details>
        <quantity>12</quantity>
        <bracket>18</bracket>
        <stock>(S30) Standard Smooth</stock>
        <foil>false</foil>
    </details>
    <fronts>
        <card>
            <id>G:\Google Drive\Chilli_Axe's MTG Renders\0. White\Academy Rector.png</id>
            <slots>1,2,0</slots>
            <name>Academy Rector.png</name>
            <query>academy rector</query>
        </card>
        <card>
            <id>G:\Google Drive\Chilli_Axe's MTG Renders\1. Blue\1. Search for Azcanta.png</id>
            <slots>3,4</slots>
            <name>Search for Azcanta.png</name>
            <query>search for azcanta</query>
        </card>
        <card>
            <id>G:\Google Drive\Chilli_Axe's MTG Renders\6. Colourless\All Is Dust (Secret Lair).png</id>
            <slots>5,6,7,8,9,10,11</slots>
            <name>All Is Dust (Secret Lair).png</name>
            <query>all is dust</query>
        </card>
    </fronts>
    <backs>
        <card>
            <id>G:\Google Drive\Chilli_Axe's MTG Renders\1. Blue\1. Azcanta, the Sunken Ruin.png</id>
            <slots>3,4</slots>
            <name>Azcanta, the Sunken Ruin.png</name>
            <query>azcanta sunken ruin</query>
        </card>
    </backs>
    <cardback>G:\Google Drive\Chilli_Axe's MTG Renders\12. Cardbacks\Black Lotus.png</cardback>
</order>
```
nice work! the local tool in master is a truly horrible piece of code so i'm sorry about that but i'm improving i swear 😅
> ```xml
> <card>
>     <id>G:\Google Drive\Chilli_Axe's MTG Renders\0. White\Academy Rector.png</id>
>     <slots>1,2,0</slots>
>     <name>Academy Rector.png</name>
>     <query>academy rector</query>
> </card>
> ```
Will the client still "download" the files to a cards sub-folder? Please keep in mind that, in practice, people might have deep folder structures containing duplicate filenames, so some kind of ID should still be appended to the filename (maybe just a hash of the path?). Will you be extending the schema in your rewrite branch? (I didn't quite get that.)
in my branch, the logic is now:

parsing and validation step:

1. if the `<id>` tag points to a valid file, use this as the image's file path
2. otherwise, check `/cards` to see if the image exists without the drive ID in parentheses - this has the potential to cause file name collisions, but is what allows the tool to work when using the Download All button in the web app and moving those files to `/cards` - if a file exists at that path, use this as the image's path
3. otherwise, check `/cards` using the file's gdrive ID in parentheses

downloader threads:

* local files are uploaded to mpc directly from the paths they already exist at (they aren't copied to `/cards`)

not sure if i explained that clearly sorry but hopefully the sketch below makes it clearer!
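a rough version of that lookup order in code - the function and the `(drive id)` filename format shown are illustrative, not the exact code in the branch:

```python
# sketch of the lookup order described above
import os

CARDS_DIR = "cards"


def resolve_image_path(card_id, name, drive_id):
    # 1. the <id> tag points to a valid file on disk - use it directly
    if os.path.isfile(card_id):
        return card_id
    # 2. a file grabbed via the web app's "Download All" button and moved
    #    into /cards (no drive ID in the filename - collisions possible)
    plain = os.path.join(CARDS_DIR, name)
    if os.path.isfile(plain):
        return plain
    # 3. a file previously downloaded by the tool itself, disambiguated
    #    with the gdrive ID in parentheses
    base, ext = os.path.splitext(name)
    with_id = os.path.join(CARDS_DIR, f"{base} ({drive_id}){ext}")
    if os.path.isfile(with_id):
        return with_id
    # not found locally - hand the card off to the downloader threads
    return None
```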
edit: i had to squash some commits bc my local git is a bit cooked (painful to use gitkraken and a private github email address) but this is up to date now: https://github.com/chilli-axe/mpc-autofill/tree/local-tool-rewrite/autofill - not done yet but most of the way there
Sorry, I'm not that familiar with the codebase: wouldn't it be worth the effort to add another optional tag (e.g., `<path>`) to be explicit and avoid future confusion? Would that actually require many changes?
you're probably right yea - i suppose it's not a problem if some cards don't have a `<path>` tag! will think about it more but i'll probably end up doing this

one reason it's slightly nicer to do this w/ the id tag is that the common cardback (stupidly, i shouldn't have designed the xml schema this way) only has a single text field, and it's more consistent to assign that text to `drive_id` and go from there - it's probably a bad move to make a breaking change to the schema at this point
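if we did add it, parsing could stay backwards compatible along these lines (sketch only - nothing about the schema change is decided yet):

```python
# sketch: prefer an optional <path> tag, fall back to <id> for old files
import xml.etree.ElementTree as ET


def image_source(card):
    """Return ("local", path) or ("gdrive", drive_id) for a <card> element."""
    path = card.findtext("path")  # None when the tag is absent
    if path:
        return ("local", path)
    return ("gdrive", card.findtext("id"))


order = ET.parse("cards.xml").getroot()
for card in order.iter("card"):
    print(image_source(card))
```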
I appreciate your consideration of backwards compatibility. However, now that you're rewriting the client anyway would be the best time to fix these things. Also, I believe most people won't get into trouble with a change, because they can a) still grab an older release or b) use your web application anyway, to which we can apply the same changes. So I wouldn't be too defensive when aiming for a good and future-oriented change. :)
I'm also working on some ideas to bundle the client with Docker. This might further reduce the chance of picking incompatible tool versions. I might come back to you with some Django-related questions, because I'm considering adding a button to launch the client directly from the web interface, and I'm not too familiar with Django / web development.
Oh dear, looks like there's several cooks in the kitchen on this 😅
I'll have a look later at adding HTTP download support to the rewritten autofill tool. I've also started on abstracting sources to allow for different parameters for different source types (like drive id and drive link, or URL, or local file path) on the web side of things, but I'm not 100% sure I'm going about it the right way yet. I'll try to get some code up within the next few days if you wanted to take a look!
getting there with this feature! hoping to have a pr up before too long https://github.com/chilli-axe/mpc-autofill/tree/local-file-support
Wow, that's a lot of new code! :smile: Just one question so far. With local files added to the static files like this:

```python
LOCAL_FILE_INDEX = r""  # for example: r"C:\Users\John Doe\Desktop\MPC Cards"
# [...]
STATICFILES_DIRS = [
    os.path.normpath(os.path.join(BASE_DIR, "cardpicker/static")),
    os.path.normpath(LOCAL_FILE_INDEX),
]
```

Does this mean that when I call `python3 manage.py collectstatic`, all cards will be copied?
yes - it works fine in development but will require some more thought for dockerising since you're serving static files with nginx. a few ideas:

1. use `collectstatic` to collect all files that aren't in `LOCAL_FILE_INDEX` to the django static directory, then have nginx serve that directory as well as `LOCAL_FILE_INDEX` on `/static` - might that be possible? for what it's worth, `collectstatic` has an optional argument to ignore files matching a pattern but i can't seem to force it to ignore my local file index.
2. store `LOCAL_FILE_INDEX` in a text file in the base directory so you don't need to modify django settings to configure the local index - this might open up the possibility of copying the value from the configuration file into django settings after running `collectstatic`?
3. the `update_database` script could create low resolution thumbnails of all images in `LOCAL_FILE_INDEX` and django could serve these rather than the full res images - copying the thumbnails with `collectstatic` would then be less costly than copying the full resolution images. might still require a non-trivial amount of storage space though, and i haven't tested how long creating thumbnails in this way might take (rough sketch of the idea below).

open to any suggestions on this!
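rough sketch of idea 3 - Pillow as an assumed dependency, and the thumbnail size and directory names are guesses:

```python
# walk LOCAL_FILE_INDEX and write low-res thumbnails into the static dir
import os

from PIL import Image

LOCAL_FILE_INDEX = "/cards"
THUMB_DIR = "cardpicker/static/thumbnails"
THUMB_SIZE = (400, 560)  # roughly card-shaped; exact size is a guess

os.makedirs(THUMB_DIR, exist_ok=True)
for root, _dirs, files in os.walk(LOCAL_FILE_INDEX):
    for filename in files:
        if not filename.lower().endswith((".png", ".jpg", ".jpeg")):
            continue
        with Image.open(os.path.join(root, filename)) as im:
            im.thumbnail(THUMB_SIZE)  # in-place, preserves aspect ratio
            # note: duplicate filenames across folders would collide here
            im.save(os.path.join(THUMB_DIR, filename))
```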
Intuitively, I was also thinking of the first option. It should be very easy to implement in Docker, so we should just try this one first. The second option would work just as well, but I don't think it's necessary. The third option sounds interesting for performance reasons, but we should first try the simple way before we blindly optimize for something that might not be necessary in the first place.
I will check out the branch soon and give it a try. Then I'll also see if there are other things that we'll need to consider.
Hey there, thx for working on this, guys!

Just a small input here (I might be too late to the party): I would recommend a two-step process:

1. install and deploy the MPCAutoFill server with an "empty image database"
2. run a program / script to scrape image URLs and fill the database from a third party (Google Drive, a static file server, etc.)

The main benefits are:

The only downside is that if someone updates files on the third party, the scraper needs to be re-run.

For static file serving, as you guys are already using Docker, I would recommend using something like this: https://hub.docker.com/r/halverneus/static-file-server
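For example, if the third party is nginx with `autoindex_format json` enabled (as mentioned earlier in this thread), step 2 could look roughly like this (sketch only; the base URL is an assumption and error handling is omitted):

```python
# Rough sketch of a scraper against an nginx JSON autoindex.
import requests


def crawl(base_url):
    """Recursively yield file URLs from an nginx autoindex_format json listing."""
    for entry in requests.get(base_url).json():
        if entry["type"] == "directory":
            yield from crawl(f"{base_url}{entry['name']}/")
        else:
            yield f"{base_url}{entry['name']}"


for url in crawl("http://static-file-server:8080/cards/"):
    print(url)  # a real scraper would insert these into the database
```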
For people that already have directories with only the images they want to use, and don't want to have to use the web interface, I hacked together a script. It can load front images from one directory, and back images from another. It does require some editing, and manually setting the `slot` of card backs after the XML is generated.

https://gist.github.com/rsullivan00/df968d764101a84244b4b1a06caecf79
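The general idea is roughly this (a simplified sketch, not the actual gist - the real script also loads backs and needs the manual slot edits mentioned above):

```python
# Build an order XML from a directory of front images, using the same
# tags as the example order earlier in this thread.
import os
import xml.etree.ElementTree as ET


def build_order(front_dir):
    order = ET.Element("order")
    fronts = ET.SubElement(order, "fronts")
    for slot, filename in enumerate(sorted(os.listdir(front_dir))):
        card = ET.SubElement(fronts, "card")
        ET.SubElement(card, "id").text = os.path.join(front_dir, filename)
        ET.SubElement(card, "slots").text = str(slot)
        ET.SubElement(card, "name").text = filename
    return ET.ElementTree(order)


build_order("fronts").write("cards.xml")
```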
While Google Drive is a nice interface in a distributed setup, for the time being it would be helpful to have some alternative for local execution of the web application. Without a collection of Google Drives available, it would be helpful if one's own files could be offered easily, without the need for a (large, paid) Google Drive. I could imagine having a local folder as an additional source of images, or even some other web-space. While I would like to help and contribute to developing such a feature, I think this topic requires the expertise of @ndepaola to judge what is doable and reasonable. Maybe we can do some brainstorming here. I was thinking of a few possibilities:
1. A local folder as an additional source of images.
2. Some other web-space as a source of images, downloaded as usual through `autofill.exe` for MPC upload.

Anyway, I think all methods would require adding an additional field to `drives.csv` and going from there.

What do you think? Is this something worth targeting? Doable in reasonable effort? Or is this rather a different project altogether?