madMAx43v3r / chia-gigahorse

PLOTS WON'T COPY FROM -T to -D #115

Open NewSunSEO opened 1 year ago

NewSunSEO commented 1 year ago

Hello, I have added an Intel DC P3700 1.6TB PCI Express 3.0 x4 SSD to my R720XD server. There are two pools, each with 6 Seagate Exos X20 ST20000NM007D 20TB 7.2K RPM SATA 6Gb/s drives. Before adding the P3700, I was plotting with the same directory for both -D and -T. I have run sudo chmod 0770 on the two mounts for -D and -T. The plots are being created in -T, but they are not moving to the final -D. I'm really new to Proxmox, TrueNAS & Chia farming. If anyone has any insight about where to look to fix this issue, I would really appreciate it. I'm guessing there is a permission issue somewhere, but I'm not sure where to look yet.

Thank you

madMAx43v3r commented 1 year ago

it's -t and -d, lower case

NewSunSEO commented 1 year ago

Yes, I use lowercase in the actual script. The UPPERCASE is part of my naming convention for my organization... I use UPPERCASE for important things. This is how my script actually looks:

CURRENT - Since the plots aren't moving, I went back to using the same directory for both -t and -d:
./cuda_plot_k32 -n -1 -C 8 -S 4 -t /mnt/pool1/ -d /mnt/pool1/ -c xxx -f xxx

WITH Intel DC P3700 - The plots aren't moving from -t to -d. I tried moving them with a script & that did not work either, so I'm thinking I have some sort of permission issue somewhere. I'm hoping someone else has experienced this & can suggest where to look:
./cuda_plot_k32 -n -1 -C 8 -S 4 -t /mnt/ssd_backup/ -d /mnt/pool1/ -c xxx -f xxx

madMAx43v3r commented 1 year ago

Any errors in the terminal? That's very strange..

NewSunSEO commented 1 year ago

No, none. It just keeps running & filling up the -t drive. Both -t and -d are writable, but I am not able to move the plots from -t to -d for some reason...

madMAx43v3r commented 1 year ago

What filesystem is /mnt/pool1/ ? maybe it reports zero free space?

NewSunSEO commented 1 year ago

/mnt/pool1/ is an NFS share on TrueNAS. Maybe this has something to do with the permissions of the NFS share...

madMAx43v3r commented 1 year ago

If it's a permission error you should see messages about "failed to copy ... "
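One way to rule out a permission problem before a long plotting run is to try creating a file in each directory as the same user that runs the plotter. A minimal sketch, assuming bash and a mktemp that accepts a path template; the function name can_write and the probe file name are made up for illustration and are not part of the plotter:

```shell
#!/usr/bin/env bash
# can_write DIR: succeed only if the current user can create a file in DIR.
# Hypothetical helper for illustration; it creates a small probe file and
# removes it again.
can_write() {
    local dir="${1%/}" probe
    # mktemp fails (non-zero) if DIR is missing or not writable by this user.
    probe="$(mktemp "$dir/.write_probe.XXXXXX" 2>/dev/null)" || return 1
    rm -f "$probe"
}
```

For example, `can_write /mnt/pool1/ && echo writable || echo "not writable"`, run as the plotter's user for both the -t and the -d directory.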

NewSunSEO commented 1 year ago

I will stop the current plot creation using the same mount points & try this again & post any errors in terminal.

Thank you,

NewSunSEO commented 1 year ago

I let it run for a bit. It is on its 6th plot. I'm getting plot times tonight around 12 minutes on a 3060 Ti, with 6 new 20TB Seagate Exos drives in pool1. The plots are not moving to /mnt/pool1/ and I am not getting any errors.

NOT-MOVING-PLOTS

NOT-MOVING-PLOTS-02

madMAx43v3r commented 1 year ago

[image] You never noticed this message? It's an issue with free space detection: your -d is reporting zero free space ... or it's really full

madMAx43v3r commented 1 year ago

Let's see df -h output
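df -h shows what the mounted filesystem reports as available, which is the kind of figure a free-space check would see. A minimal sketch of reading that number in a script, assuming a POSIX df (-P and -k are standard options); free_kib and has_room_kib are hypothetical helper names, and this is not how the plotter itself performs its check:

```shell
#!/usr/bin/env bash
# free_kib DIR: print the available space, in KiB, that DIR's filesystem reports.
# -P forces one output line per filesystem, -k selects 1024-byte units;
# field 4 of the data line is the "Available" column.
free_kib() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# has_room_kib DIR NEEDED_KIB: succeed if DIR reports at least NEEDED_KIB free.
has_room_kib() {
    local dir="$1" needed="$2" avail
    avail="$(free_kib "$dir")"
    [ -n "$avail" ] && [ "$avail" -ge "$needed" ]
}
```

For example, `free_kib /mnt/pool1/` prints the raw figure, and `has_room_kib /mnt/pool1/ 80000000` tests for roughly 82 GB, a value chosen here as an assumed upper bound for one plot. A result of 0 on a pool that is known to be empty would point at the share or mount rather than at the disks.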

NewSunSEO commented 1 year ago

I did not catch that message, but it is not accurate. This is the destination pool. It was full when it was just two drives, but then I expanded that pool & added four more new drives:

[image]

NewSunSEO commented 1 year ago

NOT-MOVING-PLOTS-04

NewSunSEO commented 1 year ago

pool1 was full around 37T. It got to 42T when I was plotting to it for both the -t and -d

NewSunSEO commented 1 year ago

I am trying to plot to pool2 that is a new pool with all new drives:

./cuda_plot_k32 -n -1 -C 8 -S 4 -t /mnt/ssd_backup/ -d /mnt/pool2/ -c xxx -f xxx

madMAx43v3r commented 1 year ago

there is no chia_plot_sink_disable(.txt) file in /mnt/pool1/ ?

NewSunSEO commented 1 year ago

I did not create anything like that myself. Does your software create that?

madMAx43v3r commented 1 year ago

No, it's a manual feature, mostly used to avoid writing to the root filesystem when disks are not mounted.
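The check being asked about amounts to looking for a marker file in the destination directory. A minimal sketch, assuming the two spellings implied by chia_plot_sink_disable(.txt), that is, with and without the .txt extension; sink_disabled is a hypothetical helper name, not part of the plotter:

```shell
#!/usr/bin/env bash
# sink_disabled DIR: succeed if DIR contains the manual "disable" marker file.
# The two file names are an assumption based on the "(.txt)" in the thread.
sink_disabled() {
    local dir="${1%/}"
    [ -e "$dir/chia_plot_sink_disable" ] || [ -e "$dir/chia_plot_sink_disable.txt" ]
}
```

For example, `sink_disabled /mnt/pool1/ && echo "marker file present"`.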

NewSunSEO commented 1 year ago

Oh ok. I have scripts that auto mount the mount points at startup. Should I add something for this - chia_plot_sink_disable(.txt) ?

madMAx43v3r commented 1 year ago

it's strange that writing to pool1 as -t works but not for -d, the free space check is the same in both cases ...

madMAx43v3r commented 1 year ago

> Should I add something for this - chia_plot_sink_disable(.txt) ?

It's not needed on Linux, since it will fail due to permissions anyway.

madMAx43v3r commented 1 year ago

Maybe it does copy, but it's super slow and the first copy takes forever and you didn't notice?

NewSunSEO commented 1 year ago

The Intel DC P3700 in -t is a fairly fast SSD. I left the plotter for several hours sitting, not writing plots & then came back to it. The plots did not move from -t to -d at all. I stopped it with around 6 plots on it.

madMAx43v3r commented 1 year ago

Are there any *.plot.tmp files on pool1 ? If yes what size are they?

NewSunSEO commented 1 year ago

There have been, but whenever I see any, I remove them from there.

madMAx43v3r commented 1 year ago

So it's trying to copy, right? Don't remove them while it's copying.

madMAx43v3r commented 1 year ago

try again and check the size of these files, to see copy progress
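Copy progress can be followed by listing the sizes of the *.plot.tmp files in the destination at intervals and checking whether they grow. A minimal sketch, assuming bash; tmp_plot_sizes is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# tmp_plot_sizes DIR: print "<size in bytes><TAB><file name>" for every
# *.plot.tmp file directly inside DIR. Prints nothing if there are none.
tmp_plot_sizes() {
    local dir="${1%/}" f
    for f in "$dir"/*.plot.tmp; do
        # With no matches the glob stays literal, so skip non-existent paths.
        [ -e "$f" ] || continue
        printf '%s\t%s\n' "$(wc -c < "$f" | tr -d ' ')" "${f##*/}"
    done
}
```

For example, `while sleep 10; do date; tmp_plot_sizes /mnt/pool1/; done` prints the sizes every 10 seconds. A size that increases between samples means the copy is running, only slowly.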

madMAx43v3r commented 1 year ago

I suspect the copy operation is super slow, for some reason

NewSunSEO commented 1 year ago

The two mount points went offline. I gave the plotter VM 400 TiB of RAM, mounted /mnt/ramdisk/, added that to my mount script & restarted the whole server. Something knocked my two pools offline. I'm trying to fix them. I will try creating 5 plots again as soon as I have this fixed.

Thank you for your help

NewSunSEO commented 1 year ago

I am new to Linux & I know this is outside supporting your software, but quickly, does this look to you like I lost both pools & likely need to recreate them? I know I'll lose the existing plots if I do that...

NOT-MOVING-PLOTS-05

NewSunSEO commented 1 year ago

I just deleted the pools & recreated them new. I created 3 new plots. I think you're right, they are copying, but the process is going really slowly... If you have any ideas what could be causing this, please let me know....

Thank you for your help,

NewSunSEO commented 1 year ago

The plots were moving really slowly, but then the CLI session disconnected after 5 minutes of inactivity & the copying process stopped when it disconnected...

NOT-MOVING-PLOTS-06

NewSunSEO commented 1 year ago

I deleted all the previous plots I had & started over. The disks were empty, they were definitely not full....

madMAx43v3r commented 1 year ago

I have no idea about TrueNAS sorry, try to ask for help on my discord https://discord.gg/BswFhNkMzY

NewSunSEO commented 1 year ago

Hello madMAx43v3r, I've made some progress & have around 170TiB farming so far, but I'm still having an issue. I changed the Intel P3700 NVMe drive from being a share in TrueNAS to being passed through directly to the plotter VM. I am getting 2.5 minute plot creations now, but the plots are not copying from -t to -d. Both -t & -d are writable. I can create plots directly in either the -t or the -d directory. But at this point, the plots just stay in -t and do not move to -d. This is what I am using, testing with 3 plots at a time right now:

./cuda_plot_k32 -n 3 -C 8 -S 4 -t /mnt/plots/ -d /mnt/pool1/ -c xxx -d xxx

I can't find anything else so far to help with this.

Thank you,

NewSunSEO commented 1 year ago

ls -l /mnt/plots
total 224319804
-rw-rw-r-- 1 chia chia 76594369368 May 31 08:28 plot-k32-c8-2023-05-31-08-22-f7c6751af4f53a20f4400ea5e998b5eaf14203e95ba1a0222b9ef4c67d430606.plot
-rw-rw-r-- 1 chia chia 76585876024 May 31 08:31 plot-k32-c8-2023-05-31-08-28-48870c7d3c49e4396b5101a7c899c440aba8cd8fb1b4143427a1cf419d59dbf9.plot
-rw-rw-r-- 1 chia chia 76523226576 May 31 08:34 plot-k32-c8-2023-05-31-08-30-0a67fc25946775c815196b345c274f16098f0224a70b0030a60ec95ba4596194.plot

NewSunSEO commented 1 year ago

That was the -t directory /mnt/plots. The -d directory has the same permissions set.

NewSunSEO commented 1 year ago

I formatted the NVMe drive like this: sudo mkfs.ext4 /dev/nvme0n1. Maybe it needs to be a different filesystem?