adriantich / DnoisE

Distance denoise by Entropy
GNU General Public License v3.0

.csv input errors: "key error 'id'" and length of data #10

Closed SanniH closed 3 years ago

SanniH commented 3 years ago

Hello again!

So I went to run DnoisE with a .csv input and output, but got an error similar to the one I got with a fasta, where somewhere along the code it doesn't recognize my naming anymore. I decided to run my csv inputs with the name "id" for the OTU/ESV identifier column, as the error seems to point in that direction. It ran smoothly until it began to write the output, when it hit "ValueError: Length of passed values is 57, index implies 56.". My .csv contains a total of 55 columns, with sample columns starting at 4 and ending at 55. Not sure what happened there along the way... This happens with all of my datasets: the input ncol is one less than the length the index implies, and the length of passed values is 2 more than the original ncol.

I've attached a zip file containing my updated .csv (the one with "id" as the identifier column name), the error I got for the id key error, and the full output of my batch job from when DnoisE starts to write the outputs. I'm hoping it's something as simple as me not saving my csv file right!

csv_input.zip

The key error is not that big a deal, but I thought I'd add it here just in case you want it to be customisable for users, though personally if your future how-to guide tells users to have it named "id" then that solves that problem! :)
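For anyone hitting the same two errors, a quick pre-flight check of the CSV can rule out a file-saving problem before running anything. The sketch below is not DnoisE code: `check_csv` is a hypothetical helper and the tiny tables are made up for illustration. It only assumes what is described above, namely that the identifier column is expected to be called "id", and it additionally flags rows whose field count differs from the header (for example because of a stray trailing comma).

```python
import csv
import io


def check_csv(handle, id_column="id"):
    """Return a list of problems found in a CSV table (hypothetical helper)."""
    reader = csv.reader(handle)
    header = next(reader, None)
    if header is None:
        return ["file is empty"]
    problems = []
    # The identifier column name is an assumption taken from this thread.
    if id_column not in header:
        problems.append(f"no column named {id_column!r} in header")
    # The header counts as row 1, so data rows start at 2.
    for row_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_no}: {len(row)} fields, header has {len(header)}"
            )
    return problems


# Made-up example tables, not real DnoisE input.
good = "id,seq,count,sample1\nESV_1,ACGT,5,5\n"
bad = "name,seq,count,sample1\nESV_1,ACGT,5,5,\n"  # wrong id name, trailing comma

print(check_csv(io.StringIO(good)))
print(check_csv(io.StringIO(bad)))
```

To check a real file, the same function can be called on `open("my_table.csv", newline="")`. An empty list means neither of the two problems was found, which would point away from the file itself.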

adriantich commented 3 years ago

Please let me check! I'll answer soon!

A.

adriantich commented 3 years ago

Hello Sanni! I think it's fixed now. I tried running it with your data and it worked well. Please also check the installation of the pandas module on your computer. In 'slurm-931755.out', line 21, there is a warning about this: ..../site-packages/pandas/compat/__init__.py:120: UserWarning: Could not import the lzma module. Your installed Python is incomplete. Attempting to use lzma compression will result in a RuntimeError. That was not the cause of the problem, but I mention it just in case.
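A quick way to confirm whether the Python interpreter in use really lacks lzma support is to try importing the module directly. This is only a minimal sketch using the standard library; `has_module` is a hypothetical helper name, and it should be run with the same Python that runs DnoisE.

```python
import importlib


def has_module(name):
    """Return True if this interpreter can import the named module."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# False here means this Python was built without lzma support.
print("lzma available:", has_module("lzma"))
```

If this prints False, pandas will keep showing the warning on import, but it only matters when lzma (.xz) compression is actually used.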

Thanks a lot for your comments. They are really helpful!

Adrià Antich

SanniH commented 3 years ago

Hi,

Perfect, I will run it again first thing tomorrow. Thanks for being so quick to respond! :)

I saw that "lzma" warning before and will see if I can get it sorted. I'm running DnoisE installed locally in my personal directory on a cluster, using a globally installed version of Python, so updating it or its related packages is more complicated than it really should be... Hoping it won't interfere with this though, because I haven't had a runtime error yet!

Cheers, Sanni

adriantich commented 3 years ago

Hello Sanni, I have just pushed one last update. If you installed with git, please run git pull to update again. It is not a major change, but just in case ;) Please tell me whether it runs well or not!

Cheers, A.

SanniH commented 3 years ago

Hi,

It ran smoothly, no issues :)

Thanks!! Sanni